WO2017035368A1 - System and method for interactive video conferencing - Google Patents

System and method for interactive video conferencing

Info

Publication number
WO2017035368A1
Authority
WO
WIPO (PCT)
Prior art keywords
computing device
user computing
processor
audio
video
Prior art date
Application number
PCT/US2016/048720
Other languages
French (fr)
Inventor
Edo Segal
Charles Miller
Dmytro Panin
Original Assignee
Touchcast LLC
Priority date
Filing date
Publication date
Application filed by Touchcast LLC
Priority to EP16840132.1A (EP3342158A4)
Publication of WO2017035368A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H04L12/1827 Network arrangements for conference optimisation or adaptation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H04L65/403 Arrangements for multi-party communication, e.g. for conferences

Definitions

  • the present application relates, generally, to content presentation and, more particularly, to a system and method for providing and interacting with content via interactive communication sessions.
  • Interactive and supplemental content that has been made available to viewers has been done through a decoupled, separate communication channel.
  • a producer can provide a separate communication channel with data, a video stream, or both at a URL associated with the broadcast.
  • a television station can have on-air programming and also provide supplemental content available through a website.
  • these communication channels are generally decoupled from one another. In other words, the broadcaster has only an indirect relationship to the viewer with regard to any supplemental content.
  • a networked communication session is defined vis-a-vis parameter information received from a first device, and access to the networked communication session is provided to the first device. Access to the networked communication session can be further provided to each of additional device(s) in accordance with information associated with each of the additional device(s).
  • a composited interactive audio/video feed is generated and transmitted that comprises audio/video input received during the networked communication session from each of the devices.
  • Each of the devices is configured via a gesture-based user interface to enable user-based selection and respective manipulation of at least a portion of the composited audio/video feed that results in a customized representation of the composited audio/video feed at a particular one device that is not represented at a particular other of the device(s).
  • at least one of the additional user computing device(s) is configured to present the composited audio/video feed on a display via an Internet media extender, wherein the display can be a television.
  • At least one processor receives from the first user computing device, information representing at least one video application ("vApp") that, when executed by the first user computing device and/or the at least one additional user computing device, results in functionality, information and/or content to be provided during the networked communication session.
  • vApp identification information representing the at least one vApp is transmitted to at least one additional user computing device, and is usable to configure the at least one additional user computing device to execute the vApp.
  • the at least one processor receives from the first user computing device and/or the at least one additional user computing device, information representing the at least one other user computing device and providing, by the at least one processor, the respective access to the networked communication session to each of the at least one additional user computing device as a function of the information representing the at least one other user computing device.
  • FIG. 1 is a diagram illustrating an example hardware arrangement that operates for providing the systems and methods disclosed herein;
  • FIG. 2 is a block diagram that illustrates functional elements of a computing device in accordance with an embodiment
  • Fig. 3 is an example high-level diagram that illustrates interactivity between various ones of the devices illustrated in Fig. 1;
  • FIG. 4 illustrates an example implementation of an alternate arrangement
  • Fig. 6 illustrates a portion of a display screen that includes a graphical screen control that is selectable for transmitting an invitation to one or more user(s) to join an interactive conferencing session in accordance with an example implementation of the present application;
  • Fig. 7 illustrates a portion of a display screen that includes graphical screen controls for adding sharable content, muting a microphone and recording a session in accordance with an example implementation of the present application;
  • Fig 8 illustrates an example high-level diagram of an enterprise implementation, including a plurality of user computing devices, an Internet media extender and a television;
  • FIGs. 9A - 9D illustrate examples of sharing content between devices, in accordance with an example implementation of the present application
  • FIG. 10 illustrates an example set of applications and controls available via a mobile application and operating on user computing device during an interactive video conferencing session, in accordance with an example implementation of the present application
  • FIGs. 11A-11E illustrate examples of sharing content between devices, in accordance with an example implementation of the present application
  • FIGs. 12A-12B illustrate examples of sharing content between devices, in accordance with an example implementation of the present application
  • FIGs. 13A-13B illustrate an example user computing device configured and operational in accordance with the present application.
  • Fig. 14 is an example flow diagram illustrating example steps associated with an implementation of the present application.

Detailed Description

  • the present application provides systems and methods for providing interactive video conferencing over one or more data communication networks, such as the Internet.
  • Devices operating, for example, iOS, ANDROID, WINDOWS MOBILE, BLACKBERRY, MAC OS, WINDOWS or other operating systems are configured with one or more software applications that provide functionality, such as with an interface for developing ("authoring") distributable coordinated presentations.
  • Such presentations can include interactive video having customizable and interactive functionality for and between devices with a plurality of end-users who receive the video.
  • the one or more software applications configure a user computing device with a viewing/interactive tool, referred to herein, generally, as a "consuming" interface, for end-users who receive interactive videos that are authored in accordance with the present application and that are usable for end-users to communicate (e.g., via interactive video conferencing functionality).
  • client interface users may interact with each other and share interactive videos and other content as a function of touch and gestures, as well as graphical screen controls that, when selected, cause a computing device to execute one or more instructions and effect various functionality.
  • a smartphone or other mobile computing device can be configured via one or more applications to simulate a laser pointer, drawing tool, mouse, trackball, keyboard or other input device.
  • implementations of the present application provide a simple-to-use, informative and entertaining communications experience that incorporates content from a plurality of computing devices, e.g., smartphones, tablets, laptops and desktops, and enables live, real-time sharing and conferencing capability therefor.
  • one or more televisions can be used for respective audio/visual display devices, and can provide feed from cameras and/or microphones configured with various local and/or remotely located computing devices that are communicating over data communication networks such as the Internet.
  • a television can be implemented in the present application in various ways, such as via an Internet media extender provided by APPLE TV, ROKU, AMAZON FIRE TV or GOOGLE
  • an Internet media extender refers, generally, to a category of devices that provide for content to be streamed to a television, surround sound devices, and the like. Unlike functionality provided by known Internet media extenders, however, the present application facilitates integrating audio/video input capabilities of computing devices (e.g., microphones, cameras and software that drive and enhance audio/visual captures) into video-conferencing capabilities. The present application facilitates one or more of: one-to-one (1:1) video conferencing; group video conferencing; sharing and/or viewing of content provided on a plurality of computing devices; and interactive computing activities.
  • content which can be formatted as and/or include images, audio/video content, website content, computer programs and/or content provided in various formats (collectively referred to herein, generally, as "vApps"), can be implemented vis-a-vis one or more mobile software applications.
  • vApp icons can be provided that represent vApps that are included with the conferencing sessions.
  • the user can interact with the respective vApp.
  • Functionality, information and/or content can be associated with the vApp and provided in a shared conferencing session, which is made available to user computing devices connected thereto.
  • the present application provides for interactive video conferencing that integrates audio/video input and output from individual mobile computing devices (e.g., smartphones and tablet computers) with Internet media extender devices (e.g., APPLE TV).
  • the present application provides a new form of live and interactive functionality that can make a person's living room or other residential viewing area into a high-end video conferencing suite.
  • Non-residential implementations are supported, as well, as shown and described in greater detail herein.
  • online collaborative services are provided, for example, including for webinars, webcasts, and meetings.
  • Internet technologies such as TCP/IP connectivity support web conferencing services, including sharing of audio, video, textual and various forms of multimedia content.
  • System 100 can include one or more data processing apparatuses 102 that are at least communicatively coupled to one or more user computing devices 104 across communication network 106.
  • Data processing apparatuses 102 and user computing devices 104 can include, for example, mobile computing devices such as tablet computing devices, smartphones, personal digital assistants or the like, as well as laptop computers and/or desktop computers.
  • one computing device may be configured as a data processing apparatus 102 and a user computing device 104, depending upon operations being executed at a particular time.
  • an audio/visual capture device 105 is depicted in Fig.
  • the audio/visual capture device 105 can be configured to interface with one or more data processing apparatuses 102 for producing high quality and interactive multimedia content, and supporting interactive video conferencing.
  • data processing apparatus 102 can be configured to access one or more databases for the present application, including image files, video content, documents, audio/video recordings, metadata and other information.
  • data processing apparatus 102 can access any required databases via communication network 106 or any other communication network to which data processing apparatus 102 has access.
  • Data processing apparatus 102 can communicate with devices comprising databases using any known communication method, including a direct serial, parallel, universal serial bus (“USB”) interface, or via a local or wide area network.
  • Communication network 106 can be any communication network, but is typically the Internet or some other global computer network.
  • Data connections 108 can be any known arrangement for accessing communication network 106, such as the public Internet, private Internet (e.g., VPN), dedicated Internet connection, dial-up serial line interface protocol/point-to-point protocol (SLIP/PPP), integrated services digital network (ISDN), digital subscriber line (DSL), or asynchronous transfer mode (ATM).
  • User computing devices 104 preferably have the ability to send and receive data across communication network 106, and are equipped with web browsers, software applications, or other means, to provide received data on display devices incorporated therewith.
  • user computing devices 104 may be personal computers such as Intel Pentium-class and Intel Core-class computers or Apple Macintosh computers, tablets, or smartphones, but are not limited to such computers.
  • Other computing devices which can communicate over a global computer network such as palmtop computers, personal digital assistants (PDAs) and mass-marketed Internet access devices such as WebTV can be used.
  • the hardware arrangement of the present invention is not limited to devices that are physically wired to communication network 106; wireless communication can be provided between wireless devices and data processing apparatuses 102.
  • system 100 can include Internet media extender 110 that is communicatively coupled to television 112, such as via a high-definition multimedia interface (“HDMI”) or other connection.
  • user computing device 104 provides user access to data processing apparatus 102 for the purpose of receiving and providing information.
  • The specific functionality provided by system 100, and in particular data processing apparatuses 102, is described in detail below.
  • System 100 preferably includes software that provides functionality described in greater detail herein, and preferably resides on one or more data processing apparatuses 102 and/or user computing devices 104.
  • One of the functions performed by data processing apparatus 102 is that of operating as a web server and/or a web site host.
  • Data processing apparatuses 102 typically communicate with communication network 106 across a permanent (i.e., un-switched) data connection 108. Permanent connectivity ensures that access to data processing apparatuses 102 is always available.
  • FIG. 2 illustrates, in block diagram form, an exemplary data processing apparatus 102 and/or user computing device 104 that can provide various functionality, as shown and described herein. Although not expressly indicated, one or more features shown and described with reference to Fig. 2 can be included with or in the audio/visual capture device 105, as well.
  • Data processing apparatus 102 and/or user computing device 104 may include one or more microprocessors 205 and connected system components (e.g., multiple connected chips) or the data processing apparatus 102 and/or user computing device 104 may be a system on a chip.
  • the data processing apparatus 102 and/or user computing device 104 includes memory 210 which is coupled to the microprocessor(s) 205.
  • the memory 210 may be used for storing data, metadata, and programs for execution by the microprocessor(s) 205.
  • the memory 210 may include one or more of volatile and non-volatile memories, such as Random Access Memory ("RAM"), Read Only Memory ("ROM"), Flash, Phase Change Memory ("PCM"), or other type.
  • the data processing apparatus 102 and/or user computing device 104 also includes an audio input/output subsystem 215 which may include one or more microphones and/or speakers.
  • a display controller and display device 220 provides a visual user interface for the user; this user interface may include a graphical user interface which, for example, is similar to that shown on a Macintosh computer when running Mac OS operating system software or an iPad, iPhone, or similar device when running iOS operating system software.
  • the data processing apparatus 102 and/or user computing device 104 also includes one or more wireless transceivers 230, such as an IEEE 802.11 transceiver, an infrared transceiver, a Bluetooth transceiver, a wireless cellular telephony transceiver (e.g., 1G, 2G, 3G, 4G), or another wireless protocol to connect the data processing system 100 with another device, external component, or a network.
  • Gyroscope/Accelerometer 235 can be provided.
  • one or more buses may be used to interconnect the various modules in the block diagram shown in Fig. 2.
  • the data processing apparatus 102 and/or user computing device 104 may be a personal computer, tablet-style device, such as an iPad, a personal digital assistant (PDA), a cellular telephone with PDA-like functionality, such as an iPhone, a Wi-Fi based telephone, a handheld computer which includes a cellular telephone, a media player, such as an iPod, an entertainment system, such as an iPod touch, or devices which combine aspects or functions of these devices, such as a media player combined with a PDA and a cellular telephone in one device.
  • the data processing apparatus 102 and/or user computing device 104 may be a network computer or an embedded processing apparatus within another device or consumer electronic product.
  • the data processing apparatus 102 and/or user computing device 104 also includes one or more input or output (“I/O") devices and interfaces 225 which are provided to allow a user to provide input to, receive output from, and otherwise transfer data to and from the system.
  • I/O devices may include a mouse, keypad or a keyboard, a touch panel or a multi-touch input panel, camera, network interface, modem, other known I/O devices or a combination of such I/O devices.
  • the touch input panel may be a single touch input panel which is activated with a stylus or a finger or a multi-touch input panel which is activated by one finger or a stylus or multiple fingers, and the panel is capable of distinguishing between one or two or three or more touches and is capable of providing inputs derived from those touches to the data processing apparatus 102 and/or user computing device 104.
  • the I/O devices and interfaces 225 may include a connector for a dock or a connector for a USB interface, FireWire, etc. to connect the system 100 with another device, external component, or a network.
  • the I/O devices and interfaces can include gyroscope and/or accelerometer 227, which can be configured to detect 3-axis angular acceleration around the X, Y and Z axes, enabling precise calculation, for example, of yaw, pitch, and roll.
  • the gyroscope and/or accelerometer 227 can be configured as a sensor that detects acceleration, shake, vibration shock, or fall of a device 102/104, for example, by detecting linear acceleration along one of three axes (X, Y and Z).
  • the gyroscope can work in conjunction with the accelerometer, to provide detailed and precise information about the device's axial movement in space.
  • the 3 axes of the gyroscope combined with the 3 axes of the accelerometer enable the device to recognize approximately how far, fast, and in which direction it has moved to generate telemetry information associated therewith, and that is processed to generate coordinated presentations, such as shown and described herein.
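One common way to combine the two sensors is sketched below. It is an illustration only: the source does not say how the readings are fused, and the complementary filter, the function names and the `alpha` weighting are assumptions introduced here. The sketch derives pitch and roll from the accelerometer's gravity reading and blends that estimate with the angle obtained by integrating the gyroscope's angular rate.

```python
import math
from typing import Tuple


def accel_to_pitch_roll(ax: float, ay: float, az: float) -> Tuple[float, float]:
    """Estimate pitch and roll (radians) from a gravity-dominated accelerometer reading."""
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return pitch, roll


def complementary_filter(prev_angle: float, gyro_rate: float, accel_angle: float,
                         dt: float, alpha: float = 0.98) -> float:
    """Blend the integrated gyroscope rate with the accelerometer-derived angle.

    alpha is a hypothetical tuning weight: values near 1.0 trust the gyroscope
    over short intervals, while the accelerometer term corrects long-term drift.
    """
    gyro_angle = prev_angle + gyro_rate * dt  # integrate angular rate over dt
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle


# Device lying flat: gravity is entirely on the Z axis, so pitch and roll are zero.
flat_pitch, flat_roll = accel_to_pitch_roll(0.0, 0.0, 9.81)
```

The same update would be run once per sensor sample for each axis of interest; yaw cannot be corrected from gravity alone and would need another reference.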
  • Additional components can also be part of the data processing apparatus 102 and/or user computing device 104, and, in certain embodiments, fewer components than that shown in Fig. 2 may also be used in data processing apparatus 102 and/or user computing device 104.
  • aspects of the inventions may be embodied, at least in part, in software. That is, the computer-implemented methods may be carried out in a computer system or other data processing system in response to its processor or processing system executing sequences of instructions contained in a memory, such as memory 210 or other machine-readable storage medium.
  • the software may further be transmitted or received over a network (not shown) via a network interface device 225.
  • hardwired circuitry may be used in combination with the software instructions to implement the present embodiments.
  • the techniques are not limited to any specific combination of hardware circuitry and software, or to any particular source for the instructions executed by the data processing apparatus 102 and/or user computing device 104.
  • the present application provides improved processing techniques to prevent packet loss, to improve handling interruptions in communications, to reduce or eliminate latency and other issues associated with wireless technology.
  • Real Time Streaming Protocol can be implemented, for example, for sharing output associated with a camera, microphone and/or other output devices configured with a computing device.
  • RTSP is an effective (though not necessary in all implementations) network control protocol for entertainment and communications systems, including in connection with streaming output.
  • RTSP is used in the present application, at least in part, for establishing and controlling media sessions between various end points, including user computing devices 104, Internet media extender 110 and data processing apparatus 102.
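To make the session-control role of RTSP concrete, the following minimal sketch formats the text of a few RTSP/1.0 requests (a request line, a CSeq sequence header, optional headers, then a blank line). It only builds the request strings and does not open a network connection. The URL, port numbers and session identifier are hypothetical placeholders, not values from the present application.

```python
from typing import Dict, Optional


def build_rtsp_request(method: str, url: str, cseq: int,
                       headers: Optional[Dict[str, str]] = None) -> str:
    """Format an RTSP/1.0 request: request line, CSeq, extra headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical endpoint exposed by a media extender; not taken from the source.
STREAM_URL = "rtsp://extender.example/conference"

session_requests = [
    build_rtsp_request("OPTIONS", STREAM_URL, 1),
    build_rtsp_request("DESCRIBE", STREAM_URL, 2, {"Accept": "application/sdp"}),
    build_rtsp_request("SETUP", STREAM_URL + "/track1", 3,
                       {"Transport": "RTP/AVP;unicast;client_port=5000-5001"}),
    # In practice the Session value is assigned by the server in its SETUP reply;
    # the identifier below is a placeholder.
    build_rtsp_request("PLAY", STREAM_URL, 4, {"Session": "12345678"}),
    build_rtsp_request("TEARDOWN", STREAM_URL, 5, {"Session": "12345678"}),
]
```

A real client would send each request over a TCP connection and read the numbered response before issuing the next one.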
  • one or more implementations of the present application can be configured to use Web Real-Time Communication ("WebRTC") to support browser-to-browser applications, including in connection with voice, video chat, and peer-to-peer ("P2P") file sharing.
  • the present application avoids a need for either internal or external plugins to connect endpoints, including for voice/video or other communication sharing.
  • the present application implements WebRTC for applications and/or Internet web sites to capture and/or stream audio and/or video media, as well as to exchange data between browsers without requiring an intermediary.
  • the set of standards that comprises WebRTC makes it possible to share data and perform
  • WebRTC includes several interrelated APIs and protocols which work together.
  • At least one of the Internet media extender components 110 includes APPLE TV.
  • an Internet media extender 110 is installed (e.g., connected to a television set and connected to a Wi-Fi, Ethernet or other local area network)
  • a software application is installed on the Internet media extender 110 (referred to herein as the "TV APP"), as well as on at least one mobile computing device 104 (referred to herein as the "MOBILE APP").
  • video content that is provided as a function of audio/video output from the computing device (e.g., iPhone) is provided instantly on the television that is connected to the Internet media extender 110.
  • audio/video feed from the iPhone is provided on a big screen.
  • the TV APP and the MOBILE APP may be configured as a single application (e.g., distributed as a single application), or may be provided as separate applications.
  • each of a plurality of participants operating, for example, user computing device 104 participates in an interactive video conference at least in part by establishing a data/communication session with the data processing apparatus 102.
  • a form of a star topology is established, in which data processing apparatus 102 is
  • the present application can implement a star topology in which a central node (e.g., a data processing apparatus 102) receives low-resolution video content from each of a plurality of computing devices (e.g., client devices 104).
  • the central node can be configured by executing program instructions to compose a single video comprising all of the video received from the various devices.
  • the single video can be provided substantially in real-time as one high-definition (“HD") video.
  • the central node can send the HD video to all of the computing devices operated by the various users, as well as to the device operated by the "presenter.”
  • each of the respective individual feeds from the respective devices is received by the data processing apparatus 102 and the video feeds (including, for example, images) are composed into a single video stream.
  • the video stream can be configured as a high definition stream (e.g., 1280x720 or higher resolution), and output to each of at least some of the respective user computing devices 104.
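As an illustration of the composition step, the sketch below computes where each participant's low-resolution feed could be placed inside a single 1280x720 frame. The near-square grid arrangement and the name `grid_layout` are assumptions made for this example; the source does not specify how the central node arranges the feeds.

```python
import math
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def grid_layout(n_feeds: int, frame_w: int = 1280, frame_h: int = 720) -> List[Rect]:
    """Return one tile rectangle per feed, arranged in a near-square grid."""
    if n_feeds < 1:
        raise ValueError("at least one feed is required")
    cols = math.ceil(math.sqrt(n_feeds))   # columns in the grid
    rows = math.ceil(n_feeds / cols)       # rows needed to hold every feed
    tile_w, tile_h = frame_w // cols, frame_h // rows
    return [((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
            for i in range(n_feeds)]


# Four participants share the HD frame as a 2x2 grid of 640x360 tiles.
tiles = grid_layout(4)
```

A compositor would scale each incoming feed to its tile size and draw it at the tile's offset before encoding the combined frame.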
  • Fig. 3 is an example high-level diagram that illustrates interactivity between various ones of the devices illustrated in Fig. 1, and identifies example communication protocols in one or more implementations of the present application.
  • the implementation illustrated in Fig. 3 is usable as a consumer (e.g., a residential) implementation, as well as an enterprise implementation.
  • WebRTC is shown with regard to communications between user computing devices 104 (shown as a CHROME BOOK and mobile computing device, e.g., a smart phone) and supporting browser-to-browser applications and P2P functionality.
  • RTSP is utilized in connection with user computing devices 104 and Internet media extender 110, thereby enabling presentation of audio/video content from devices 104 on television 112.
  • HTTP Live Streaming is utilized for HTTP-based media streaming.
  • HLS is usable to parse a stream into a sequence of small HTTP-based file downloads, each download comprising a portion of the stream.
  • a client device can select from a number of different alternate streams containing the same material encoded at a variety of data rates, allowing the streaming session to adapt to an available data rate.
  • an M3U playlist containing the metadata for the various sub-streams which are available for download is also provided and downloaded.
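The adaptive selection described above can be sketched as follows. The example reads the `BANDWIDTH` attribute of each `#EXT-X-STREAM-INF` entry in a master playlist and picks the highest-rate variant that fits a measured data rate. The sample playlist, its URIs and the helper names are made up for illustration, and a production client would use a complete playlist parser rather than this simplified one.

```python
import re
from typing import List, Optional, Tuple

SAMPLE_MASTER_PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
hd/index.m3u8
"""


def parse_variants(master_playlist: str) -> List[Tuple[int, str]]:
    """Return (bandwidth_bps, uri) pairs from an HLS master playlist."""
    variants = []
    pending_bandwidth = None
    for raw in master_playlist.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            # Require ':' or ',' before the name so AVERAGE-BANDWIDTH is not matched.
            match = re.search(r"[:,]BANDWIDTH=(\d+)", line)
            pending_bandwidth = int(match.group(1)) if match else None
        elif line and not line.startswith("#") and pending_bandwidth is not None:
            variants.append((pending_bandwidth, line))  # URI follows its tag line
            pending_bandwidth = None
    return variants


def pick_variant(variants: List[Tuple[int, str]], available_bps: int) -> Optional[str]:
    """Pick the highest-bandwidth variant that fits; fall back to the lowest one."""
    if not variants:
        return None
    fitting = [v for v in variants if v[0] <= available_bps]
    chosen = max(fitting) if fitting else min(variants)
    return chosen[1]


variants = parse_variants(SAMPLE_MASTER_PLAYLIST)
choice = pick_variant(variants, available_bps=1_000_000)
```

Re-running `pick_variant` as the measured rate changes is what lets a session move between the alternate streams.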
  • a respective computing device 104 is illustrated as the "origin” (also referred to herein, generally, as a “presenter” device) that is executing the MOBILE APP and defining a session for respective other devices to use for, for example, interactive video conferencing.
  • the "origin" device 104 establishes an initial session and options are provided to invite other users (e.g., user computing devices 104 that are configured with the MOBILE APP) to join the conferencing session. Users who are invited to join the session can further invite other users, for example, when permission for extending invitations to other users has been made available (e.g., enabled) by the origin device.
  • the MOBILE APP can include instructions for enabling a processor associated with each respective user computing device 104 to generate and transmit invitations for users to join a respective conferencing session.
  • video sharing, video conferencing, sharing of multimedia content, data, documents and various files is supported, as shown and described in greater detail herein.
  • a plurality of interactive communication sessions can be defined by an origin device.
  • Each of the respective sessions can be defined and identified using a specific title or label. For example, "#APPDEV" can be used to define and identify an interactive communication session having a topic dedicated to software application development.
  • the origin device can, thereafter, transmit invitations to computing devices 104 associated with software engineers and other relevant parties to join the
  • Videoconferencing technology in accordance with the present application is, thereafter, available for the software engineers and other relevant parties and the user of the origin device 104, such as to confer over topics associated with software application development.
  • the same respective origin device 104 can define another respective session with a different topical focus, e.g., sales and entitled #SALES.
  • invitations to the #SALES session can be transmitted by the origin device to computing devices 104 associated with individuals in a sales and marketing department.
  • Videoconferencing technology in accordance with the present application is, thereafter, available for those one or more individuals to confer about topics associated with sales. In one or more
  • At least part of the respective groups of users is mutually exclusive, in that members of the "#APPDEV" session cannot participate in the #SALES session, and at least some of the members of the #SALES session cannot participate in the #APPDEV session.
  • an invitation is sent to a user of a computing device 104 for joining a session defined by a respective topic (e.g., #TOPIC)
  • the user affirmatively accepts the invitation and is, thereafter, authorized to join the session.
  • the user can select, via the MOBILE APP, an identifier representing the respective session (e.g., #TOPIC) provided, which causes the user computing device 104 to execute one or more instructions that enable the device 104 to connect to and/or join the session, and access materials associated therewith.
  • rules can be defined and/or enforced that restrict access to sessions and/or content to respective users.
  • a session defined as #TOPIC may be shared by seven users; however, rules can be defined by a user of the origin computing device 104 and implemented that restrict all but three of the users from real-time video conferencing via the #TOPIC session.
  • Content associated with the #TOPIC session can be made available to all seven of the users.
  • Materials associated with a respective session can be stored (e.g., backed up) remotely, e.g., in the "cloud", and be available for access, archived and/or made available for users in the future. Such control can be restricted from future access, as well.
  • communication sessions including interactive video among respective individuals and parties.
  • a plurality of parties can connect to a respective videoconferencing session in accordance with the present application.
  • a client/server model can be effective for handling the respective devices and management of data and communications thereof.
  • one or a few individuals connect to a respective videoconferencing session in accordance with the present application.
  • a P2P framework can be effective in an implementation.
  • FIG. 4 illustrates an example implementation showing an alternative arrangement 400 in which a plurality of users (1-n) operating user computing devices 104 are connected to the server (data processing apparatus 102) and sending individual video feeds and receiving back the composite video feed illustrating users 1-n.
  • the composed HD video stream that comprises the users' video streams can be disassembled by each of the respective user computing devices 104 that receive the composed video stream into individual elements, for example, each representing a respective user participating in the video conference. Other individual elements representing various users can be disassembled, as well.
  • the received video from the data processing apparatus 102 can be received by each respective client device 104, and then disassembled into independent portions.
  • a video matrix of 50 users is transmitted to each of the 50 client devices 104, and each device 104 disassembles the matrix into 50 individual portions that can be manipulated, such as dragged around the display of the client device, resized, closed, or the like.
  • Each respective user can enjoy a custom and individualized layout of each of the respective feeds from the respective devices, as the disassembled portions can be remapped, such as on a user-defined or pre-defined grid.
  • the data processing apparatus 102 receives 51 streams of video, composes the streams into a canvas and outputs one frame (e.g., HD) at that moment of time.
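The client-side disassembly described above can be sketched as follows. This is a minimal illustration only: the text does not say how the server arranges feeds in the composed frame, so the row-major, near-square grid, the 1920x1080 frame size, and the names `Tile` and `split_composite` are all assumptions made for the example.

```python
import math
from typing import List, NamedTuple


class Tile(NamedTuple):
    """Crop rectangle of one participant inside the composed frame."""
    user_index: int
    x: int
    y: int
    width: int
    height: int


def split_composite(frame_w: int, frame_h: int, n_users: int) -> List[Tile]:
    """Return one crop rectangle per participant.

    Assumes (hypothetically) that the server lays the feeds out in a
    row-major grid that is as close to square as possible.
    """
    cols = math.ceil(math.sqrt(n_users))
    rows = math.ceil(n_users / cols)
    tile_w, tile_h = frame_w // cols, frame_h // rows
    return [
        Tile(i, (i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
        for i in range(n_users)
    ]


# Each client can then crop these rectangles out of the received frame and
# place them on its own user-defined grid, independently of other clients.
tiles = split_composite(1920, 1080, 50)
```

Because every client computes the rectangles locally, one user dragging or resizing a tile has no effect on how the same composed frame is shown on any other device.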
  • when a participant speaks or otherwise provides audio content in the conference, that user's device 104 may experience an echo when that user's device 104 plays the audio.
  • the present application may be configured such that the client devices 104 are muted.
  • a limited number of participants may be permitted to contribute audio (e.g., speak) at a time. For example, three devices 104 of fifty devices 104 are configured to enable audio input, while the rest of the devices 104 are muted.
  • each speaking user's device 104 may be configured to silence the audio output originating from that user's device 104 to eliminate any echo.
  • the present application supports a form of audio session
  • a 4th device 104 inputs audio that is distributed with the composited video provided by the server 102, one of the other three devices 104 is "bounced" and audio input for that device may be removed.
  • Various forms of audio moderation and/or control options are supported for the client devices 104. For example, a first-in-first-out arrangement is supported, such that as one device 104 is added for speech, the device 104 that has held speech capability the longest is removed.
  • auto-detect functionality is supported in which audio capability for devices 104 is automatically turned on and off as vocal input is detected.
  • a moderator assigns rights to the various devices for speech input.
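The first-in-first-out moderation option described above can be sketched as a small queue of active speakers. This is an illustrative sketch, not the application's implementation: the class name `SpeakerSlots`, the device identifiers, and the default limit of three speakers (taken from the three-of-fifty example) are assumptions.

```python
from collections import deque
from typing import Deque, Optional


class SpeakerSlots:
    """Tracks which devices may contribute audio, oldest first."""

    def __init__(self, max_speakers: int = 3) -> None:
        self.max_speakers = max_speakers
        self._active: Deque[str] = deque()

    def request_audio(self, device_id: str) -> Optional[str]:
        """Enable audio for device_id; return the bounced device, if any."""
        if device_id in self._active:
            return None  # already allowed to speak, nothing changes
        bounced = None
        if len(self._active) >= self.max_speakers:
            # First in, first out: the longest-standing speaker is removed.
            bounced = self._active.popleft()
        self._active.append(device_id)
        return bounced

    def is_muted(self, device_id: str) -> bool:
        return device_id not in self._active


slots = SpeakerSlots(max_speakers=3)
for device in ("device-1", "device-2", "device-3"):
    slots.request_audio(device)
bounced = slots.request_audio("device-4")  # bounces "device-1"
```

The auto-detect and moderator-assigned options could reuse the same structure, with `request_audio` called by a voice-activity detector or by the moderator's controls instead of by the participant.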
  • Figs. 5A-5G show display screens 500, 510, 520, 530, 540, 550 and 560, representing an example implementation of the present application in which a host user presenter is selecting content for recording interactive video content and providing an interactive conference.
  • a user can establish a video conference, record video content, share the content and interact with the users during the video conference.
  • Fig. 5H illustrates an example display screen 570 of a video conference provided on a client device 104 in accordance with an example implementation.
  • the video composed by the server 102 comprising the plurality of video feeds from the respective client devices 104 has been disassembled by the respective client device 104, and the various video feeds of the participants have been arranged according to the user's preference.
  • the relative positions and formats of the disassembled users' video feeds can be predefined by the user or in other default configuration, and/or can be manipulated for a particular video conference session. The respective manipulations can be saved and used in future video conference sessions to position/format the respective feeds in accordance with a previous session.
  • the MOBILE APP can configure a user computing device 104 to provide graphical screen controls that, when selected by a user, result in one or more instructions being executed by a processor associated with a user computing device 104 to generate and send an invitation to one or more other user(s).
  • An example interface including a graphical screen control 602 that is selectable for transmitting such an invitation to one or more second user computing devices 104 is illustrated in Fig. 6.
  • the invitation may be via the Internet media extender 110 (e.g., APPLE TV), user computing device 104 or other device, depending upon a respective implementation and/or context of use.
  • a user engaged in an interactive conference via the Internet media extender 110 and television 112 may use a remote control device in connection with making selections and issuing instructions, in one or more implementations.
  • An example virtual pointing device is shown and described in commonly assigned U.S. provisional patent application 62/105,682, filed on January 20, 2015, the entire contents of which are incorporated by reference as if set forth in its entirety, herein.
  • An invitation provided in accordance with the present application can include a hyperlink to a uniform resource locator ("URL") and can be sent via SMS, email, provided in the MOBILE APP, copied in memory, such as to a computing device "clipboard", or provided on an Internet web page, or distributed in other suitable ways. Thereafter, at least one other user selects the hyperlink, such as via a mobile user computing device 104 (e.g., a smartphone, tablet or other computing device) that is configured by executing the MOBILE APP, and the user(s) is automatically connected via the communication session defined by the origin computing device 104.
  • a prompt to install the MOBILE APP on the device 104 can be provided.
  • the user computing device 104 can be routed to a site (e.g., APP STORE or ANDROID MARKET) to download/install the MOBILE APP.
  • the user can use the link or otherwise connect to the origin device's session.
  • if the user is operating a mobile user computing device 104 that is not configured by executing the MOBILE APP, for example, when the user computing device 104 is a laptop computer, a desktop computer, or a network-based computer (e.g., a CHROMEBOOK), then the user can be prompted to launch a web browser software application (e.g., CHROME or SAFARI) and the user computing device 104 can connect to the session, for example via a hyperlink.
  • hyperlinks to respective sessions can be persistent, and accordingly be repeatedly used for connection.
  • one or more names can be used to define and refer to a session, and shared in such ways that users can simply select the same link, select the hyperlink as provided in the web browser's history or in some other consistent manner to reconnect to a session and/or access content/information associated with a respective session.
  • graphical screen controls section 702 is provided that includes controls for adding sharable content, muting a microphone and recording a session. It is to be appreciated that section 702 and the corresponding controls therein are exemplary and that many other options can be provided in section 702 (see, for example, controls 1002 in Fig. 10).
  • a business and/or enterprise implementation can integrate an Internet media extender and television to provide a video conferencing and/or data collaboration solution.
  • a group of people wish to start a data and video conference collaboration session. Some of the people are located in a conference room that includes a TV and a connected Internet media extender (e.g., APPLE TV). Others of the people are in various remote geographic locations, with some using mobile computing devices (e.g., smartphones or tablet computers) and some using desktop/laptop computers.
  • a user associated with the origin host starts the session via the Internet media extender via the TV APP and/or a mobile computing device 104 and via the MOBILE APP, which have been downloaded and installed on at least some of the respective devices 104 and 110 in the conference room and operated by the participants.
  • an initial configuration which may occur the first time the MOBILE APP and/or TV APP is launched, audio/video content from one or more of the computing devices 104 is provided instantly on the television 112 via the Internet media extender 110.
  • an invitation can be transmitted, e.g., via the Internet media extender or mobile computing device 104, to one or more other users' computing devices 104.
  • the invitation can include a hyperlink to a URL that can be sent via SMS, email, provided in the MOBILE APP or on an Internet web page. Users of computing devices 104 select the link, which configures the user computing device(s) 104 to launch the MOBILE APP and join the session automatically.
  • the present application supports integration of multiple cameras and microphones that can be connected remotely to an Internet media extender, such as APPLE TV.
  • a plurality of mobile computing devices 104 e.g., iPhone/iPad/laptop
  • This is similar, in practice, to a "TouchCast" studio multi-camera setup, which allows multiple cameras to feed into an authoring tool.
  • a description of such an authoring tool is shown and described in greater detail in commonly assigned U.S. Patent No. 9,363,448, issued June 7, 2016. Supporting live audio/video feed by multiple cameras provides an advantage and technological benefit for multiple people located in the same room and/or remotely located to utilize their respective mobile devices.
  • audio detection mechanisms can be employed such that when a user speaks, feed from the microphone and/or camera on that user's respective device is provided on audio output (e.g., speakers) associated with the television 112 (via, for example, the Internet media extender 110), as well as on connected computing devices 104 operated by people remotely located (i.e., not in the local setting).
  • cameras associated with the connected computing devices 104 can be "cut to" via one of several ways.
  • the host user can make selections to switch input from various cameras/devices.
  • the MOBILE APP configures the respective computing devices 104 with automatic speaker detection, which operates to detect when a user is speaking and input from that user's respective
  • a user proactively takes control to have audio/video feed from his or her user computing device 104, which can be effected by simply tapping on the screen of the user's computing device 104, to make that user's device 104 provide the primary feed, and can be presented to the other user computing devices 104 in the session.
  • the present application supports video compositing on the TV APP and/or the MOBILE APP, which can include sharing of content between a plurality of devices.
  • the present application supports multiple configurations in which multiple users are presented during the same session. For example, a primary presenter is viewable on all the devices, and everyone else connected to the session views the primary presenter and/or content the primary presenter shares.
  • co-viewing is supported, in which a plurality of (e.g., all) members of the session are visible on the screen at the same time.
  • live co-viewing of content viewed by participants is preferably synchronized.
  • content can include YouTube videos, and a group of viewers watch videos together and are in sync as they do so, allowing participants to control the playlist of upcoming content, effectively collaboratively creating a playlist.
  • Another example can include video streams that are live from an external feed.
  • Other examples include viewing a website, and reviewing images and videos from the users' mobile devices 104 (via same stream from device 104 to the Internet media extender 110).
  • a hybrid mode is supported in which all participants can be presented, and one or more instructions executing on a computing device 104/data processing apparatus 102 implement an algorithm that governs which respective user computing device 104 is featured, making some device(s) more prominent as a function of current activity (e.g., someone speaking).
  • multi-camera coverage is supported for a location (e.g., a conference room) which is configured by a primary camera positioned at or with the television 112 that is coupled to the Internet media extender 110, and that includes participants' user computing devices 104. In operation, content can cut between the full view of the room and the individual participants when they speak.
  • FIGs. 9A-9D illustrate examples of sharing content between devices 104, in accordance with an example implementation of the present application.
  • Examples in Figs. 9A-9D identify respective control over portions of content of interactive display screens of user computing devices 104, including as used to provide content on a plurality of devices 104, as well as for individual control over a respective user computing device 104.
  • individual portions can be sized, moved, or otherwise transformed in a respective user's user computing device 104, without affecting other devices 104.
  • Each of a plurality of users e.g., USER 1 and USER 2
  • the users are represented in video that is captured by cameras configured with the respective users' computing devices, and that is displayed substantially in real-time.
  • Fig. 10 illustrates an example set of applications and controls 1002 available via the MOBILE APP and operating on user computing device 104 during an interactive video conferencing session, in accordance with the present application.
  • controls 1002 include options for drawing and for providing content from online sources (e.g., webpages, Wikipedia, news sources, maps, Twitter or the like), as well as controls for drawing tools, apps (e.g., calculator, image editing/controls and audio/video feed).
  • Other activities can be integrated, such as set forth in vApps, and in the context of communications.
  • games such as Tic-Tac-Toe, Chess, Backgammon, puzzle games, draw together, trivia, cards and table games.
  • Other games can include action games, such as first person shooter games.
  • virtual adventure and exploration is supported, such as GOOGLE MAPS and 360 views, and photo galleries (e.g., FLICKR, GOOGLE IMAGES), shopping galleries, planning tools (e.g., for travel, entertaining, ordering food, etc.), and various professional consultation is supported, such as telemedicine and other professional consultations.
  • collaboration and collaborative tools include a collaborative whiteboard, document creation and editing, mood board collaboration, general brainstorming, reviewing comps, training and instruction (with real-time assessment), focus group input, and can include real-time voting, sharing, collective web site and/or web application use that includes (but is not limited to) documents, spreadsheets, content management, project management, calendar planning or the like.
  • the present application provides systems and methods for participating in video conferencing and sharing that can be layered with interactive content, including content that is available over one or more data communication networks, such as the Internet. Further, one or more software applications provide a viewing/interactive tool for end-users who receive videos that are authored in accordance with the present application, including in an on-line interactive conference.
  • a client interface (which may be configured in a MOBILE APP, or provided via a standard web browser application or other suitable platform), users may interact with videos as a function of touch and gestures, as well as other suitable interfaces, such as provided via an application that configures a smartphone or other mobile computing device to simulate a laser pointer, drawing tool, mouse, trackball, keyboard or other input device.
  • a processor configured with code processes information representing a selection event that occurred in the display unit. For example, a user makes a selection in a remote control software application operating on his or her mobile computing device (e.g., iPhone) in a portion of the display unit while the interactive media content in the display unit is provided therein.
  • the processing that occurs can be to determine at least a relative time and location of the selection event that occurred in the second portion of the display.
  • the information representing the selection event can be stored in one or more databases that are accessible to at least one computing device.
  • the selection of an item can be processed to enable the interaction with at least a portion of the interactive media content at one of the remote devices associated with the selection event. This enables results of a respective interaction associated with the selection event to be viewable or otherwise provided at one particular remote device, but not viewable or otherwise provided at other of the remote devices.
  • Other functionality includes drawing and selection features, which is effective for explaining and highlighting features within a coordinated presentation, including during an interactive video conferencing session.
  • the present application supports providing virtual representations of respective users that are interfacing via a respective video conferencing session in accordance with the teachings herein.
  • one or more modules can be implemented to simulate the viewpoint for each of the respective users that one particular user is located "behind" the display screen configured with, or otherwise accessible by, user computing device 104.
  • a partial representation of a person can similarly be represented and displayed in dynamic ways to represent movement, such as in the context of an interactive video conferencing session.
  • Such functionality can be provided in a particularly configured vAPP.
  • An example is set forth in Figs. 11A-11E.
  • Fig. 11A illustrates an example video conferencing session displayed on a user computing device 104, in which the session is between two users (USER 1 and USER 2).
  • writing virtual control (e.g., icon) 1101 is included that, when selected, causes computing device 104 and/or data processing apparatus 102 to execute one or more instructions to enable a unique interactive environment which simulates the appearance of users.
  • the simulation effect can occur dynamically as a user interacts with user computing device 104, such as by drawing or writing on the screen, using his or her finger, an electronic passive or active stylus or other suitably configured writing device.
  • Upon a determination by user computing device 104 that a user is interacting accordingly, one or more processes can operate to detect and capture specific information associated with the interaction, such as that a user is drawing on his or her display with a stylus and the specific location where the stylus is in contact with the display.
  • the computing device 104 and/or 102 can alter the video representation of an interacting user (e.g., USER 1) on one or more of the computing devices operated by the respective other users during the interactive video conferencing session.
  • An example is shown in Fig. 11B, in which a display of a computing device operated by USER 2 is shown, and in which the video representation of USER 1 has been enlarged and shaded as a function of one or more filter effects (e.g., shading and size).
  • a graphical representation of a human hand 1102 has been overlaid on the display of user computing device 104.
  • the hand 1102 is represented to be holding a writing instrument, such as a stylus (e.g., APPLE PENCIL). Furthermore, the writing instrument is configured to appear to be making contact with the surface of the display at the specific point 1104 at which USER 1 is interacting.
  • the altered representation of USER 1 can be provided on all of the computing devices of the interactive video conferencing session, including the computing device operated by USER 1. Alternatively, the computing device operated by USER 1 is not affected while USER 1 interacts, such as by drawing on his or her display screen.
  • USER 1 appears to be positioned on the other side or within the computing device 104 operated by another user (e.g., USER 2).
  • This can be implemented using one or more image and/or video editing techniques. For example, adjustments can be made to the video content to implement transparency, shading, size or other suitable effects, and one or more layer masks can be used to selectively apply the adjustment.
  • a video composite can be generated that dynamically changes (e.g., moves or changes positions) in accordance with respective user interaction and/or activity.
  • Figs. 11B-11D illustrate video content showing USER 1 in an enlarged and shaded format, which occurs as a function of USER 1 interacting by writing on the display screen of his user computing device 104.
  • Upon USER 1 lifting his writing instrument away from the display screen, thereby ending his respective edits, the video showing USER 1 returns to its previous format (Fig. 11E).
  • the user's writing or drawing can remain displayed on each of the other users' computing devices.
  • the present application provides for a realistic and interactive experience for users of an interactive video conferencing session.
  • a user e.g., USER 1
  • computing device 104 and/or 102 detects and monitors the activity substantially in real-time.
  • a portion of the display can be mapped as a series of coordinates and/or pixels, and the respective location of each interaction can be determined and used for setting a corresponding location on each user's computing device during an interactive video conferencing session.
  • a virtual hand 1102 can be generated and displayed on each of the other users' computing devices, with the specific point 1104 of the stylus represented as being held by the virtual hand 1102 positioned at the same coordinates.
  • the virtual hand 1102 moves and the specific point 1104 of the stylus moves to the same location(s).
  • the hand 1102 can be configured to turn and rotate, such as at particular locations within the display screen, in order to provide a more realistic and accurate representation of someone writing or drawing.
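The coordinate mapping described above can be sketched by converting each contact point to resolution-independent coordinates before sending it to the other devices. This is a minimal sketch under stated assumptions: the text says only that a portion of the display is mapped as coordinates and/or pixels, so the normalization to the 0..1 range and the function names `to_normalized` and `to_pixels` are illustrative choices, not the application's method.

```python
from typing import Tuple

Point = Tuple[float, float]


def to_normalized(x_px: float, y_px: float, width: int, height: int) -> Point:
    """Convert a contact point on the drawing user's display to 0..1 coordinates."""
    return (x_px / width, y_px / height)


def to_pixels(norm: Point, width: int, height: int) -> Tuple[int, int]:
    """Place a normalized point on a receiving display of any resolution."""
    return (round(norm[0] * width), round(norm[1] * height))


# USER 1 touches the centre of a 1024x768 display; the tip of the stylus held
# by the virtual hand is drawn at the matching point on a 1920x1080 display.
contact = to_normalized(512, 384, 1024, 768)
stylus_tip = to_pixels(contact, 1920, 1080)
```

Sending normalized points keeps the virtual hand aligned with the drawing even when participants use displays of different sizes.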
  • NURBs Non-Uniform Rational Basis Splines
  • Fig. 12A illustrates a display screen of a computing device operated by USER 2, and shows USER 1 drawing a landscape that includes mountains and a road. USER 1 is drawing the landscape to show USER 2 a type of photographic image that USER 1 would like to use for commercial purposes.
  • USER 2 selects an icon to browse for specific content, such as images that are stored on the computing device operated by USER 2. After locating a suitable image that is locally stored on USER 2's computing device, the image is inserted and shown to USER 1 (Fig. 12B).
  • FIGs. 13A-13B illustrate example implementations of the present application from the perspective of a user (USER 1) drawing on a display.
  • USER 1 is preparing to draw on his display using a stylus, but has not yet made contact with the display.
  • In Fig. 13B, USER 1 has caused his stylus to make contact with the display and is drawing.
  • the enlarged and shaded representation of USER 1 is displayed on the computing device operated by USER 1.
  • Fig. 14 illustrates a flow diagram of steps S100 in connection with one example implementation of the present application.
  • a first user computing device 104 that represent a session (e.g., #SALES) (step S102).
  • a network session is defined (step S104).
  • access is provided to the first user computing device 104 (step S106).
  • the network session access can include, for example, functionality that is provided for presenter or origin devices, which can include more rights than are provided to participant or non-presenter devices.
  • information associated with additional computing device(s) 104 is received (step S108).
  • different access to the network session is provided for the additional computing device(s) (step S110).
  • at step S112, audio and/or video (“audio/video”) input is received from each of the respective devices 104.
  • For example, a feed from a camera and/or microphone that is operatively connected to each computing device 104 is received by data processing apparatus 102.
  • a composited audio/video feed is generated using the audio/video input (step S114), and the composited audio/video feed is transmitted to each of the devices 104 (step S116).
  • Each of the first user computing device 104 and the additional user computing devices 104 is configured via a gesture-based user interface to enable user-based selection and respective manipulation of at least a portion of the composited audio/video feed that results in a customized representation of the composited audio-video feed at a particular one of the at least one additional user computing device that is not represented at a particular other of the at least one additional user computing device. Thereafter, the process ends (step S118).
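The flow of Fig. 14 can be sketched, very loosely, as a small in-memory model. Everything here is hypothetical scaffolding for illustration: the `Session` class, the role names "presenter" and "participant", and the use of strings to stand in for audio/video feeds are assumptions, and real compositing of media is outside the scope of the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Session:
    """A networked communication session identified by a label such as #SALES."""
    label: str
    roles: Dict[str, str] = field(default_factory=dict)   # device id -> access level
    inputs: Dict[str, str] = field(default_factory=dict)  # device id -> latest feed

    def grant_access(self, device_id: str, role: str) -> None:
        self.roles[device_id] = role

    def receive_input(self, device_id: str, feed: str) -> None:
        # Step S112: accept audio/video input only from devices with access.
        if device_id not in self.roles:
            raise PermissionError(f"{device_id} has no access to {self.label}")
        self.inputs[device_id] = feed

    def composite(self) -> List[str]:
        # Step S114: stand-in for compositing; simply orders the feeds.
        return [self.inputs[d] for d in sorted(self.inputs)]

    def transmit(self) -> Dict[str, List[str]]:
        # Step S116: every device with access receives the same composite.
        feed = self.composite()
        return {device_id: feed for device_id in self.roles}


def define_session(label: str, origin_device: str) -> Session:
    # Steps S102-S106: define the session and give the origin device
    # presenter-level access.
    session = Session(label)
    session.grant_access(origin_device, "presenter")
    return session


session = define_session("#SALES", "origin")
session.grant_access("guest-1", "participant")  # steps S108-S110
session.receive_input("origin", "origin-av")
session.receive_input("guest-1", "guest-1-av")
delivered = session.transmit()
```

The per-device customization that follows transmission is deliberately left out, since it happens on each receiving device rather than on the data processing apparatus.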
  • the present application can further include a feature that allows a user to simulate a laser pointer effect when presenting embedded content in a coordinated presentation.
  • the on-board gyroscope provided with a mobile computing device, such as a smartphone, can be employed to locate where laser pointer functionality effectively should appear in a coordinated presentation.
  • vApps (substantially as shown and described in co-pending and commonly assigned Patent Application Serial Number 14/316,536) can be interacted with as a function of the mobile computing device simulating a laser pointer.
  • the user can perform various laser-pointer functionality, such as to navigate between slides, point to or identify respective portions of a display including in a coordinated presentation, or perform other automated tasks.
  • the pointer can be moved around the screen as a function of the gyroscope within the mobile computing device.
  • a mobile computing device that is configured with the MOBILE APP is configured to simulate the remote control, laser device and/or annotation and drawing tool.
  • various vApps can be included in a respective interactive video conferencing session and can be manipulated (including pointed at) by a user using the remote control. This enables a user to be decoupled from the respective authoring platform, such as the iPad or other tablet computer, by using a simulated remote control, for example, across the room.
  • content that is curated, e.g., queued for inclusion in a video conferencing session, can be selected simply by pressing an option in a graphical user interface provided on the mobile computing device operating as the remote control.
  • Interactive elements, e.g., vApps, can be embedded in the conference session simply by the author selecting an option to include the element(s) in the remote control software app.
  • the author can add and/or remove interactive elements by selecting options in the remote control app operating on the mobile computing device, which effectively frees the author from being tied to the authoring tool (e.g., the iPad).
  • users or participants of a conferencing session of the present application can use a remote control, pointing and annotation functionality, including on their respective mobile computing devices (e.g., smartphones) to interact with coordinated presentations.
  • a viewer of a coordinated presentation can select a vApp during playback to be presented with the content therein, and thereafter highlight, draw on or perform other interactive functionality using the simulated remote control.
  • a virtual pointer, an annotating tool or other selection tool in a coordinated presentation can be remotely controlled.
  • a smartphone or other mobile computing device effectively doubles as a mouse, selection device, drawing tool or other interactive device.
  • the mobile computing device and coordinated presentation authoring/playback device preferably communicate over Wi-Fi.
  • the remote can ask the "master" device via Wi-Fi or other protocol, such as Bluetooth, for permission to connect therewith.
  • the telematics of the mobile computing device such as an accelerometer and/or gyroscope, is employed over a digital IP connection to transmit to the presentation authoring and/or playback software, which in turn functions to control a simulated laser red dot, drawing tool or other functionality, which can be configured as a core function of the presentation authoring and/or playback application.
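One way to picture the gyroscope-driven pointer is to integrate the reported angular rates into an on-screen position. The sketch below is an assumption-laden illustration: the text does not specify the mapping, so the linear `gain` (pixels per radian), the sign conventions for yaw and pitch, the clamping to the screen edges, and the function name `move_pointer` are all hypothetical.

```python
from typing import Tuple


def move_pointer(
    pos: Tuple[float, float],
    yaw_rate: float,     # rad/s, positive when the phone turns right (assumed)
    pitch_rate: float,   # rad/s, positive when the phone tilts up (assumed)
    dt: float,           # seconds since the previous gyroscope sample
    screen: Tuple[int, int],
    gain: float = 800.0,  # pixels per radian, an arbitrary illustrative value
) -> Tuple[float, float]:
    """Advance the simulated laser dot by one gyroscope sample."""
    x = pos[0] + yaw_rate * dt * gain
    y = pos[1] - pitch_rate * dt * gain  # screen y grows downward
    # Keep the dot on the display.
    x = min(max(x, 0.0), screen[0] - 1.0)
    y = min(max(y, 0.0), screen[1] - 1.0)
    return (x, y)


# The phone turns right at 0.5 rad/s for 0.1 s: the dot moves about 40 px right.
dot = move_pointer((100.0, 100.0), 0.5, 0.0, 0.1, (1920, 1080))
```

In a deployment along the lines described, the mobile device would send each sample (or the resulting position) over the IP connection, and the presentation software would draw the dot.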
  • a new form of remote control is provided by pairing a mobile computing device, such as a smartphone, with an authoring tool device, a display device, or other presentation device.
  • Various functionality provided in connection with coordinated presentations such as controlling teleprompters (e.g., speed), green screen functionality, whiteboard functionality, titles, interactive Web-based content, or the like, can be selected and controlled vis-a-vis the remote control in connection with the present application.
  • the present application further provides for simulating laser pointer control, annotation and drawing devices, such as for controlling publicly viewed or accessible displays.
  • the present application is also useful for public speaking or other public presentation activities.
  • the present application supports simulation of a laser pointer, annotation and drawing tool.
  • the present application provides a simple to use, yet powerful interactive remote video conferencing platform that incorporates a plurality of computing devices, e.g., smartphones, tablets, laptops and desktops, and enables live, realtime sharing and conferencing.
  • One or more televisions 112 can be implemented in the present application via an Internet media extender 110, and content can be provided from a plurality of remote sources, such as cameras and/or microphones configured with user computing devices 104 that are located remotely and communicating over the Internet.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Telephonic Communication Services (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

A networked communication session is defined vis-à-vis parameter information received from a first device, and access to the networked communication session is provided to the first device. Access to the networked communication session is further provided to each of additional device(s) in accordance with information associated with each of the additional device(s). A composited interactive audio/video feed is generated and transmitted that comprises audio/video input received during the networked communication session from each of the devices. Each of the devices is configured via a gesture-based user interface to enable user-based selection and respective manipulation of at least a portion of the composited audio/video feed that results in a customized representation of the composited audio-video feed at a particular one device that is not represented at a particular other of the device(s).

Description

System and Method for Interactive Video Conferencing
Cross-Reference To Related Applications
[0001] This application is based on and claims priority to U.S. Provisional Patent Application No. 62/209,727, filed August 25, 2015, U.S. Provisional Patent Application No. 62/242,029, filed October 15, 2015 and U.S. Provisional Patent Application No. 62/329,081, filed April 28, 2016, the contents of each of which are incorporated by reference in their respective entireties, as if set forth herein.
Field
[0002] The present application relates, generally, to content presentation and, more particularly, to a system and method for providing and interacting with content via interactive communication sessions.
Background
[0003] Interactive and supplemental content that has been made available to viewers has been done through a decoupled, separate communication channel. For instance, a producer can provide a separate communication channel with data, a video stream, or both at a URL associated with the broadcast. For example, a television station can have on-air programming and also provide supplemental content available through a website. Apart from sponsoring both sources of information, these communication channels are generally decoupled from one another. In other words, the broadcaster has only an indirect relationship to the viewer with regard to any supplemental content.
[0004] One of the biggest missed opportunities of the entire category of smart TV connected devices is the absence of easy to deploy video conferencing. As millions of consumers place these devices in their living rooms the core use case of facilitating communications via the large screen TV is missing.
[0005] It is with respect to these and other considerations that the disclosure made herein is presented.
Brief Summary
[0006] In one or more implementations of the present application, a networked communication session is defined vis-a-vis parameter information received from a first device, and access to the networked communication session is provided to the first device. Access to the networked communication session can be further provided to each of additional device(s) in accordance with information associated with each of the additional device(s). A composited interactive audio/video feed is generated and transmitted that comprises audio/video input received during the networked communication session from each of the devices. Each of the devices is configured via a gesture-based user interface to enable user-based selection and respective manipulation of at least a portion of the composited audio/video feed that results in a customized representation of the composited audio-video feed at a particular one device that is not represented at a particular other of the device(s). In one or more implementations, at least one of the additional user computing device(s) is configured to present the composited audio/video feed on a display via an Internet media extender, wherein the display can be a television.
[0007] In one or more implementations, at least one processor receives from the first user computing device, information representing at least one video application ("vApp") that, when executed by the first user computing device and/or the at least one additional user computing device, results in functionality, information and/or content to be provided during the networked communication session. vApp identification information representing the at least one vApp is transmitted to at least one additional user computing device, and is usable to configure the at least one additional user computing device to execute the vApp.
[0008] In one or more implementations, the at least one processor receives from the first user computing device and/or the at least one additional user computing device, information representing the at least one other user computing device, and provides the respective access to the networked communication session to each of the at least one additional user computing device as a function of the information representing the at least one other user computing device.
[0009] Other features and advantages of the present invention will become apparent from the following description of the invention that refers to the accompanying drawings.
Brief Description Of The Drawings
[0010] Aspects of the present disclosure will be more readily appreciated upon review of the detailed description of its various embodiments, described below, when taken in conjunction with the accompanying drawings, of which:
[0011] Fig. 1 is a diagram illustrating an example hardware arrangement that operates for providing the systems and methods disclosed herein;
[0012] Fig. 2 is a block diagram that illustrates functional elements of a computing device in accordance with an embodiment;
[0013] Fig. 3 is an example high-level diagram that illustrates interactivity between various ones of the devices illustrated in Fig. 1;
[0014] Fig. 4 illustrates an example implementation of an alternate arrangement;
[0015] Figs. 5A-5H illustrate an example implementation and show display screens representing an implementation of the present application;
[0016] Fig. 6 illustrates a portion of a display screen that includes a graphical screen control that is selectable for transmitting an invitation to one or more user(s) to join an interactive conferencing session in accordance with an example implementation of the present application;
[0017] Fig. 7 illustrates a portion of a display screen that includes graphical screen controls for adding sharable content, muting a microphone and recording a session in accordance with an example implementation of the present application;
[0018] Fig. 8 illustrates an example high-level diagram of an enterprise implementation, including a plurality of user computing devices, an Internet media extender and a television;
[0019] Figs. 9A - 9D illustrate examples of sharing content between devices, in accordance with an example implementation of the present application;
[0020] Fig. 10 illustrates an example set of applications and controls available via a mobile application and operating on user computing device during an interactive video conferencing session, in accordance with an example implementation of the present application;
[0021] Figs. 11A-11E illustrate examples of sharing content between devices, in accordance with an example implementation of the present application;
[0022] Figs. 12A-12B illustrate examples of sharing content between devices, in accordance with an example implementation of the present application;
[0023] Figs. 13A-13B illustrate an example user computing device configured and operational in accordance with the present application; and
[0024] Fig. 14 is an example flow diagram illustrating example steps associated with an implementation of the present application.
Detailed Description
[0025] By way of introduction and overview, in one or more implementations the present application provides systems and methods for providing interactive video conferencing over one or more data communication networks, such as the Internet. Devices operating, for example, iOS, ANDROID, WINDOWS MOBILE, BLACKBERRY, MAC OS, WINDOWS or other operating systems are configured with one or more software applications that provide functionality, such as with an interface for developing ("authoring") distributable coordinated presentations. Such presentations can include interactive video having customizable and interactive functionality for and between devices with a plurality of end-users who receive the video. Further, the one or more software applications configure a user computing device with a viewing/interactive tool, referred to herein, generally, as a "consuming" interface for end-users who receive interactive video that are authored in accordance with the present application and usable for end-users to communicate (e.g., via interactive video conferencing functionality). Using the client interface, users may interact with each other and share interactive videos and other content as a function of touch and gestures, as well as graphical screen controls that, when selected, cause a computing device to execute one or more instructions and effect various functionality. For example, a smartphone or other mobile computing device can be configured via one or more applications in accordance with the ability to simulate a laser pointer, drawing tool, mouse, trackball, keyboard or other input device.
[0026] In accordance with the teachings herein, implementations of the present application provide a simple to use, informing and entertaining communications experience that incorporates content from a plurality of computing devices, e.g., smartphones, tablets, laptops and desktops, and enables live sharing in real time and conferencing capability therefor. In one or more implementations, one or more televisions can be used for respective audio/visual display devices, and can provide feed from cameras and/or microphones configured with various local and/or remotely located computing devices that are communicating over data communication networks such as the Internet. A television can be implemented in the present application in various ways, such as via an Internet media extender provided by APPLE TV, ROKU, AMAZON FIRE TV or GOOGLE CHROMECAST. As used herein, an Internet media extender refers, generally, to a category of devices that provide for content to be streamed to a television, surround sound devices, and the like. Unlike functionality provided by known Internet media extenders, however, the present application facilitates integrating audio/video input capabilities of computing devices (e.g., microphones, cameras and software that drive and enhance audio/visual captures) into video-conferencing capabilities. The present application facilitates one or more of: one-to-one (1:1) video conferencing; group video conferencing; sharing and/or viewing of content provided on a plurality of computing devices, and interactive computing activities.
[0027] In one or more implementations, content, which can be formatted as and/or include images, audio/video content, website content, computer programs and/or content provided in various formats (collectively referred to herein, generally, as "vApps"), can be implemented vis-a-vis one or more mobile software applications. vApp icons can be provided that represent vApps that are included with the conferencing sessions. In accordance with one or more implementations, after a respective icon is selected by a user, the user can interact with the respective vApp. Functionality, information and/or content can be associated with the vApp and provided in a shared conferencing session, which is made available to user computing devices connected thereto.
[0028] Thus, in one or more implementations, the present application provides for interactive video conferencing that integrates audio/video input and output from individual mobile computing devices (e.g., smartphones and tablet computers) with Internet media extender devices (e.g., APPLE TV). By leveraging technology configured with mobile computing devices, e.g., cameras and microphones, the present application provides a new form of live and interactive functionality that can make a person's living room or other residential viewing area into a high-end video conferencing suite. Non-residential implementations are supported, as well, as shown and described in greater detail herein.
[0029] In accordance with the present application, online collaborative services are provided, for example, including for webinars, webcasts, and meetings. In one or more implementations, Internet technologies such as TCP/IP connectivity support web conferencing services, including sharing of audio, video, textual and various forms of multimedia content.
[0030] Referring to Fig. 1, a diagram is provided that shows an example hardware arrangement that operates for providing the systems and methods disclosed herein, and designated generally as system 100. System 100 can include one or more data processing apparatuses 102 that are at least communicatively coupled to one or more user computing devices 104 across communication network 106. Data processing apparatuses 102 and user computing devices 104 can include, for example, mobile computing devices such as tablet computing devices, smartphones, personal digital assistants or the like, as well as laptop computers and/or desktop computers. Further, one computing device may be configured as a data processing apparatus 102 and a user computing device 104, depending upon operations being executed at a particular time. In addition, an audio/visual capture device 105 is depicted in Fig. 1, which can be configured with one or more cameras (e.g., front-facing and rear-facing cameras), a microphone, a microprocessor, and a communications module(s) and that is coupled to data processing apparatus 102. The audio/visual capture device 105 can be configured to interface with one or more data processing apparatuses 102 for producing high quality and interactive multimedia content, and supporting interactive video conferencing.
[0031] With continued reference to Fig. 1, data processing apparatus 102 can be configured to access one or more databases for the present application, including image files, video content, documents, audio/video recordings, metadata and other information. However, it is contemplated that data processing apparatus 102 can access any required databases via communication network 106 or any other communication network to which data processing apparatus 102 has access. Data processing apparatus 102 can communicate with devices comprising databases using any known communication method, including a direct serial, parallel, universal serial bus ("USB") interface, or via a local or wide area network.
[0032] User computing devices 104 communicate with data processing apparatuses 102 using data connections 108, which are respectively coupled to communication network 106. Communication network 106 can be any communication network, but is typically the Internet or some other global computer network. Data connections 108 can be any known arrangement for accessing communication network 106, such as the public Internet, private Internet (e.g., VPN), dedicated Internet connection, or dial-up serial line interface protocol/point-to-point protocol (SLIP/PPP), integrated services digital network (ISDN), dedicated leased-line service, broadband (cable) access, frame relay, digital subscriber line (DSL), asynchronous transfer mode (ATM) or other access techniques.
[0033] User computing devices 104 preferably have the ability to send and receive data across communication network 106, and are equipped with web browsers, software applications, or other means, to provide received data on display devices incorporated therewith. By way of example, user computing devices 104 may be personal computers such as Intel Pentium-class and Intel Core-class computers or Apple Macintosh computers, tablets, smartphones, but are not limited to such computers. Other computing devices which can communicate over a global computer network such as palmtop computers, personal digital assistants (PDAs) and mass-marketed Internet access devices such as WebTV can be used. In addition, the hardware arrangement of the present invention is not limited to devices that are physically wired to communication network 106, and wireless communication can be provided between wireless devices and data processing apparatuses 102. In addition, system 100 can include Internet media extender 110 that is communicatively coupled to television 112, such as via a high-definition multimedia interface ("HDMI") or other connection.
[0034] According to an embodiment of the present application, user computing device 104 provides user access to data processing apparatus 102 for the purpose of receiving and providing information. The specific functionality provided by system 100, and in particular data processing apparatuses 102, is described in detail below.
[0035] System 100 preferably includes software that provides functionality described in greater detail herein, and preferably resides on one or more data processing apparatuses 102 and/or user computing devices 104. One of the functions performed by data processing apparatus 102 is that of operating as a web server and/or a web site host. Data processing apparatuses 102 typically communicate with communication network 106 across a permanent, i.e., un-switched, data connection 108. Permanent connectivity ensures that access to data processing apparatuses 102 is always available.
[0036] Fig. 2 illustrates, in block diagram form, an exemplary data processing apparatus 102 and/or user computing device 104 that can provide various functionality, as shown and described herein. Although not expressly indicated, one or more features shown and described with reference to Fig. 2 can be included with or in the audio/visual capture device 105, as well. Data processing apparatus 102 and/or user computing device 104 may include one or more microprocessors 205 and connected system components (e.g., multiple connected chips) or the data processing apparatus 102 and/or user computing device 104 may be a system on a chip.
[0037] The data processing apparatus 102 and/or user computing device 104 includes memory 210 which is coupled to the microprocessor(s) 205. The memory 210 may be used for storing data, metadata, and programs for execution by the microprocessor(s) 205. The memory 210 may include one or more of volatile and non-volatile memories, such as Random Access Memory ("RAM"), Read Only Memory ("ROM"), Flash, Phase Change Memory ("PCM"), or other type. The data processing apparatus 102 and/or user computing device 104 also includes an audio input/output subsystem 215 which may include one or more microphones and/or speakers.
[0038] A display controller and display device 220 provides a visual user interface for the user; this user interface may include a graphical user interface which, for example, is similar to that shown on a Macintosh computer when running Mac OS operating system software or an iPad, iPhone, or similar device when running iOS operating system software.
[0039] The data processing apparatus 102 and/or user computing device 104 also includes one or more wireless transceivers 230, such as an IEEE 802.11 transceiver, an infrared transceiver, a Bluetooth transceiver, a wireless cellular telephony transceiver (e.g., 1G, 2G, 3G, 4G), or another wireless protocol to connect the data processing system 100 with another device, external component, or a network. In addition, Gyroscope/Accelerometer 235 can be provided.
[0040] It will be appreciated that one or more buses may be used to interconnect the various modules in the block diagram shown in Fig. 2.
[0041] The data processing apparatus 102 and/or user computing device 104 may be a personal computer, tablet-style device, such as an iPad, a personal digital assistant (PDA), a cellular telephone with PDA-like functionality, such as an iPhone, a Wi-Fi based telephone, a handheld computer which includes a cellular telephone, a media player, such as an iPod, an entertainment system, such as an iPod touch, or devices which combine aspects or functions of these devices, such as a media player combined with a PDA and a cellular telephone in one device. In other embodiments, the data processing apparatus 102 and/or user computing device 104 may be a network computer or an embedded processing apparatus within another device or consumer electronic product.
[0042] The data processing apparatus 102 and/or user computing device 104 also includes one or more input or output ("I/O") devices and interfaces 225 which are provided to allow a user to provide input to, receive output from, and otherwise transfer data to and from the system. These I/O devices may include a mouse, keypad or a keyboard, a touch panel or a multi-touch input panel, camera, network interface, modem, other known I/O devices or a combination of such I/O devices. The touch input panel may be a single touch input panel which is activated with a stylus or a finger or a multi-touch input panel which is activated by one finger or a stylus or multiple fingers, and the panel is capable of distinguishing between one or two or three or more touches and is capable of providing inputs derived from those touches to the data processing apparatus 102 and/or user computing device 104. The I/O devices and interfaces 225 may include a connector for a dock or a connector for a USB interface, FireWire, etc. to connect the system 100 with another device, external component, or a network. Moreover, the I/O devices and interfaces can include gyroscope and/or accelerometer 227, which can be configured to detect 3-axis angular acceleration around the X, Y and Z axes, enabling precise calculation, for example, of yaw, pitch, and roll. The gyroscope and/or accelerometer 227 can be configured as a sensor that detects acceleration, shake, vibration, shock, or fall of a device 102/104, for example, by detecting linear acceleration along one of three axes (X, Y and Z). The gyroscope can work in conjunction with the accelerometer, to provide detailed and precise information about the device's axial movement in space. More particularly, the 3 axes of the gyroscope combined with the 3 axes of the accelerometer enable the device to recognize approximately how far, fast, and in which direction it has moved to generate telemetry information associated therewith, and that is processed to generate coordinated presentations, such as shown and described herein.
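By way of illustration only, the following Python sketch shows one way motion telemetry of the kind described above could drive a simulated pointer dot. The names Pointer and update_pointer, the sensitivity constant and the 1280x720 canvas are assumptions made for this example and are not taken from the present application.

```python
from dataclasses import dataclass


@dataclass
class Pointer:
    """Hypothetical on-screen position of a simulated laser dot, in pixels."""
    x: float
    y: float


def update_pointer(pointer, yaw_rate, pitch_rate, dt,
                   width=1280, height=720, sensitivity=600.0):
    """Integrate angular rates (rad/s) over dt seconds into a new dot position.

    The sensitivity value (pixels per radian) and the canvas size are
    illustrative assumptions, not values from the source text.
    """
    x = pointer.x + yaw_rate * dt * sensitivity
    # Screen Y grows downward, so pitching the device up moves the dot up.
    y = pointer.y - pitch_rate * dt * sensitivity
    # Clamp so the dot never leaves the visible canvas.
    x = min(max(x, 0.0), width - 1)
    y = min(max(y, 0.0), height - 1)
    return Pointer(x, y)
```

In such a sketch, the mobile device would call update_pointer once per sensor sample and transmit the resulting coordinates to the playback software; a practical implementation would likely also filter sensor noise and correct for drift.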
[0043] Additional components, not shown, can also be part of the data processing apparatus 102 and/or user computing device 104, and, in certain embodiments, fewer components than that shown in Fig. 2 may also be used in data processing apparatus 102 and/or user computing device 104. It will be apparent from this description that aspects of the inventions may be embodied, at least in part, in software. That is, the computer-implemented methods may be carried out in a computer system or other data processing system in response to its processor or processing system executing sequences of instructions contained in a memory, such as memory 210 or other machine-readable storage medium. The software may further be transmitted or received over a network (not shown) via a network interface device 225. In various embodiments, hardwired circuitry may be used in combination with the software instructions to implement the present embodiments. Thus, the techniques are not limited to any specific combination of hardware circuitry and software, or to any particular source for the instructions executed by the data processing apparatus 102 and/or user computing device 104.
[0044] In one or more implementations, the present application provides improved processing techniques to prevent packet loss, to improve handling interruptions in communications, to reduce or eliminate latency and other issues associated with wireless technology. For example, in one or more implementations Real Time Streaming Protocol (RTSP) can be implemented, for example, for sharing output associated with a camera, microphone and/or other output devices configured with a computing device. RTSP is an effective (though not necessary in all implementations) network control protocol for entertainment and communications systems, including in connection with streaming output. RTSP is used in the present application, at least in part, for establishing and controlling media sessions between various end points, including user computing devices 104, Internet media extender 110 and data processing apparatus 102.
[0045] In addition to RTSP, one or more implementations of the present application can be configured to use Web Real-Time Communication ("WebRTC") to support browser-to-browser applications, including in connection with voice, video chat, and peer-to-peer ("P2P") file sharing. Thus, the present application avoids a need for either internal or external plugins to connect endpoints, including for voice/video or other communication sharing. In one or more implementations, the present application implements WebRTC for applications and/or Internet web sites to capture and/or stream audio and/or video media, as well as to exchange data between browsers without requiring an intermediary. The set of standards that comprises WebRTC makes it possible to share data and perform teleconferencing peer-to-peer, without requiring that the user install plug-ins or any other third-party software. WebRTC includes several interrelated APIs and protocols which work together.
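As a hedged illustration of the RTSP session control referred to in paragraph [0044], the following Python sketch formats an RTSP/1.0 request of the general shape used to describe or start a media session. The helper name rtsp_request and the example URL are hypothetical; nothing here is asserted to be the implementation of the present application.

```python
def rtsp_request(method, url, cseq, headers=None):
    """Build the text of an RTSP/1.0 request.

    An RTSP request is a request line, a CSeq sequence header, any further
    headers, and a blank line, with CRLF line endings throughout.
    """
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"
```

For example, rtsp_request("DESCRIBE", "rtsp://example.invalid/cam", 1, {"Accept": "application/sdp"}) yields a request asking an endpoint to describe its stream; the host name is a placeholder.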
[0046] In one or more implementations, at least one of the Internet media extender components 110 includes APPLE TV. After an Internet media extender 110 is installed (e.g., connected to a television set and connected to a Wi-Fi, Ethernet or other local area network), a software application is installed on the Internet media extender 110, as well as at least one mobile computing device 104. For example, a user downloads and installs an app to an Internet media extender 110 ("TV APP") and also installs an app to a user computing device 104 ("MOBILE APP"). Once installed, and the first time the TV APP is executed, the user is prompted to launch the MOBILE APP. Thereafter, the mobile computing device 104 (e.g., an iPhone) is automatically detected by the TV APP. During subsequent uses, video content that is provided as a function of audio/video output from the computing device (e.g., iPhone) is provided instantly on the television that is connected to the Internet media extender 110. In operation, audio/video feed from the iPhone is provided on the big screen. The TV APP and the MOBILE APP may be configured as a single application (e.g., distributed as a single application), or may be provided as separate applications.
[0047] In one or more implementations, each of a plurality of participants operating, for example, user computing device 104 participates in an interactive video conference at least in part by establishing a data/communication session with the data processing apparatus 102. A form of a star topology is established, in which data processing apparatus 102 is communicatively connected to each of a plurality of respective user computing devices 104 and respectively receives audio/video feed from each device, such as provided as a function of input from a respective camera and/or microphone.
[0048] Thus, in one or more implementations, the present application can implement a star topology in which a central node (e.g., a data processing apparatus 102) receives low-resolution video content from each of a plurality of computing devices (e.g., client devices 104). The central node can be configured by executing program instructions to compose a single video comprising all of the video received from the various devices. The single video can be provided substantially in real-time as one high-definition ("HD") video. The central node can send the HD video to all of the computing devices operated by the various users, as well as to the device operated by the "presenter."
[0049] Continuing with the respective one or more implementations described above, each of the respective individual feeds from the respective devices is received by the data processing apparatus 102 and the video feeds (including, for example, images) are composed into a single video stream. The video stream can be configured as a high definition stream (e.g., 1280x720 or higher resolution), and output to each of at least some of the respective user computing devices 104.
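The composition step described above can be sketched, under stated assumptions, as a layout calculation performed by the central node before it draws each incoming low-resolution feed into the single output frame. The function name tile_layout and the near-square grid policy are assumptions for illustration; the source does not specify how feeds are arranged.

```python
import math


def tile_layout(n_feeds, width=1280, height=720):
    """Return one (x, y, w, h) rectangle per feed inside a width x height frame.

    Feeds are placed row by row in a near-square grid. The grid policy is an
    illustrative assumption; the 1280x720 default matches the example HD
    resolution mentioned in the text.
    """
    if n_feeds < 1:
        raise ValueError("at least one feed is required")
    cols = math.ceil(math.sqrt(n_feeds))
    rows = math.ceil(n_feeds / cols)
    tile_w, tile_h = width // cols, height // rows
    return [((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
            for i in range(n_feeds)]
```

With four participants, each feed would occupy a 640x360 quadrant of the composed frame, which is why each client needs to upload only a low-resolution stream.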
[0050] Fig. 3 is an example high-level diagram that illustrates interactivity between various ones of the devices illustrated in Fig. 1, and identifies example communication protocols in one or more implementations of the present application. The implementation illustrated in Fig. 3 is usable as a consumer (e.g., a residential) implementation, as well as an enterprise implementation. As illustrated in Fig. 3, WebRTC is shown with regard to communications between user computing devices 104 (shown as a CHROME BOOK and mobile computing device, e.g., a smart phone) and supporting browser-to-browser applications and P2P functionality. In addition, RTSP is utilized in connection with user computing devices 104 and Internet media extender 110, thereby enabling presentation of audio/video content from devices 104 on television 112.
[0051] In one or more implementations, HTTP Live Streaming ("HLS") is utilized for HTTP-based media streaming. As known in the art, HLS is usable to parse a stream into a sequence of small HTTP-based file downloads, each download comprising a portion of the stream. As the stream plays, a client device can select from a number of different alternate streams containing the same material encoded at a variety of data rates, allowing the streaming session to adapt to an available data rate. An M3U playlist containing the metadata for the various sub-streams which are available for download is also provided and downloaded.
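The adaptive selection described above can be illustrated with a minimal Python sketch that reads the variant entries of an HLS master playlist and picks the highest-rate stream the measured throughput can sustain. The function names and the selection policy are assumptions for this example; real players typically add safety margins and smoothing.

```python
import re


def parse_master_playlist(text):
    """Return (bandwidth_bps, uri) pairs from an HLS master playlist, ascending."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    variants = []
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF:"):
            # Match BANDWIDTH but not AVERAGE-BANDWIDTH.
            m = re.search(r"(?:^|[:,])BANDWIDTH=(\d+)", line)
            # The URI of a variant is the next non-tag line.
            if m and i + 1 < len(lines) and not lines[i + 1].startswith("#"):
                variants.append((int(m.group(1)), lines[i + 1]))
    return sorted(variants)


def choose_variant(variants, measured_bps):
    """Pick the highest-bandwidth variant that fits, else the lowest one."""
    affordable = [v for v in variants if v[0] <= measured_bps]
    return max(affordable) if affordable else min(variants)
```

A client would re-run choose_variant as its throughput estimate changes, switching sub-streams at segment boundaries.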
[0052] In the example illustrated in Fig. 3, a respective computing device 104 is illustrated as the "origin" (also referred to herein, generally, as a "presenter" device) that is executing the MOBILE APP and defining a session for respective other devices to use for, for example, interactive video conferencing. In one or more implementations, the "origin" device 104 establishes an initial session and options are provided to invite other users (e.g., user computing devices 104 that are configured with the MOBILE APP) to join the conferencing session. Users who are invited to join the session can further invite other users, for example, when permission for extending invitations to other users has been made available (e.g., enabled) by the origin device. Accordingly, the MOBILE APP can include instructions for enabling a processor associated with each respective user computing device 104 to generate and transmit invitations for users to join a respective conferencing session. In accordance with the present application, video sharing, video conferencing, sharing of multimedia content, data, documents and various files is supported, as shown and described in greater detail herein.
[0053] In one or more implementations, a plurality of interactive communication sessions can be defined by an origin device. Each of the respective sessions can be defined and identified using a specific title or label. For example, "#APPDEV" can be used to define and identify an interactive communication session having a topic dedicated to software application development. The origin device can, thereafter, transmit invitations to computing devices 104 associated with software engineers and other relevant parties to join the #APPDEV session. Videoconferencing technology in accordance with the present application is, thereafter, available for the software engineers and other relevant parties and the user of the origin device 104, such as to confer over topics associated with software application development. Similarly, the same respective origin device 104 can define another respective session with a different topical focus, e.g., sales, entitled #SALES. Invitations to the #SALES session can be transmitted by the origin device to computing devices 104 associated with individuals in a sales and marketing department. Videoconferencing technology in accordance with the present application is, thereafter, available for those one or more individuals to confer about topics associated with sales. In one or more implementations, at least part of the respective groups of users is mutually exclusive, in that at least some of the members of the #APPDEV session cannot participate in the #SALES session, and at least some of the members of the #SALES session cannot participate in the #APPDEV session.
[0054] In operation, and in accordance with one or more implementations, after an invitation is sent to a user of a computing device 104 for joining a session defined by a respective topic (e.g., #TOPIC), the user affirmatively accepts the invitation and is, thereafter, authorized to join the session. Thereafter, the user can select, via the MOBILE APP, an identifier provided for the respective session (e.g., #TOPIC), which causes the user computing device 104 to execute one or more instructions that enable the device 104 to connect to and/or join the session, and access materials associated therewith. Moreover, in one or more implementations, rules can be defined and/or enforced that restrict access to sessions and/or content to respective users. For example, a session defined as #TOPIC may be shared by seven users; however, rules can be defined by a user of the origin computing device 104 and implemented that restrict all but three of the users from real-time video conferencing via the #TOPIC session. Content associated with the #TOPIC session, however, can be made available to all seven of the users. Materials associated with a respective session can be stored (e.g., backed up) remotely, e.g., in the "cloud," and be archived and/or made available for access by users in the future. Such materials can also be restricted from future access.
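By way of illustration only, the #TOPIC example above (seven users with access to content, three of whom may join real-time video) can be sketched as a small data structure. The class `Session`, its field names and the user identifiers are hypothetical and are not part of the present application.

```python
from dataclasses import dataclass, field

@dataclass
class Session:
    """Hypothetical model of a topic session and its access rules."""
    topic: str
    members: set = field(default_factory=set)        # users who accepted an invitation
    video_allowed: set = field(default_factory=set)  # subset permitted real-time video

    def can_view_content(self, user):
        # Content is available to every member of the session.
        return user in self.members

    def can_join_video(self, user):
        # Real-time video is limited to members the origin user has allowed.
        return user in self.members and user in self.video_allowed

session = Session(
    "#TOPIC",
    members={f"user{i}" for i in range(1, 8)},
    video_allowed={"user1", "user2", "user3"},
)
```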
[0055] It will be appreciated by one of ordinary skill that countless sessions can be defined for topics of all sorts and usable for respective groups of users. Demarcating individual sessions in accordance with the present application is usable in the consumer and enterprise markets alike, and solves a long-felt need associated with controlled communication sessions including interactive video among respective individuals and parties.
[0056] In certain implementations, a plurality of parties can connect to a respective videoconferencing session in accordance with the present application. In such instances, a client/server model can be effective for handling the respective devices and management of data and communications thereof. In other certain implementations, one or a few individuals connect to a respective videoconferencing session in accordance with the present application. In such cases, a P2P framework can be effective in an implementation.
[0057] Fig. 4 illustrates an example implementation showing an alternative arrangement 400 in which a plurality of users (1-n) operating user computing devices 104 are connected to the server (data processing apparatus 102) and sending individual video feeds and receiving back the composite video feed illustrating users 1-n.
[0058] In one or more implementations, the composed HD video stream that comprises the users' video streams can be disassembled by each of the respective user computing devices 104 that receive the composed video stream into individual elements, for example, each representing a respective user participating in the video conference. Other individual elements representing various users can be disassembled, as well. Thus, the video from the data processing apparatus 102 can be received by each respective client device 104, and then disassembled into independent portions. For example, a video matrix of 50 users is transmitted to each of the 50 client devices 104, and each device 104 disassembles the matrix into 50 individual portions that can be manipulated, such as dragged around the display of the client device, resized, closed, or the like. Each respective user can enjoy a custom and individualized layout of each of the respective feeds from the respective devices, as the disassembled portions can be remapped, such as on a user-defined or pre-defined grid. The data processing apparatus 102, for example, receives 51 streams of video, composes the streams into a canvas and outputs one frame (e.g., HD) at that moment of time.
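By way of illustration only, the client-side disassembly described above can be sketched as cropping a composite frame into equal tiles. The sketch assumes the client knows the grid dimensions of the video matrix and treats a frame as a list of pixel rows; the function name `split_composite` and the toy frame are hypothetical.

```python
def split_composite(frame, rows, cols):
    """Cut a composite frame (a list of pixel rows) into rows*cols tiles."""
    height, width = len(frame), len(frame[0])
    tile_h, tile_w = height // rows, width // cols
    tiles = []
    for r in range(rows):
        for c in range(cols):
            # Crop the rows, then the columns, belonging to tile (r, c).
            tile = [row[c * tile_w:(c + 1) * tile_w]
                    for row in frame[r * tile_h:(r + 1) * tile_h]]
            tiles.append(tile)
    return tiles

# A toy 4x4 "frame" holding a 2x2 matrix of participants A, B, C and D.
frame = [list("AABB"), list("AABB"), list("CCDD"), list("CCDD")]
tiles = split_composite(frame, rows=2, cols=2)

# Each tile can now be positioned independently; this user-defined layout
# (hypothetical) maps a tile index to an on-screen (x, y) position.
layout = {0: (0, 0), 3: (100, 0)}
```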
[0059] In one or more implementations, when a participant speaks or otherwise provides audio content in the conference, that user's device 104 may experience an echo, when that user's device 104 plays the audio. In order to combat cacophony, echo or other undesirable noise interference, which may occur, the present application may be configured such that the client devices 104 are muted. Further, a limited number of participants may be permitted to contribute audio (e.g., speak) at a time. For example, three devices 104 of fifty devices 104 are configured to enable audio input, while the rest of the devices 104 are muted. Further, each speaking user's device 104 may be configured to silence the audio output originating from that user's device 104 to eliminate any echo.
[0060] Furthermore, the present application supports a form of audio session management. For example, if three devices 104 out of 50 devices 104 are configured for enabling audio input, and then a 4th device 104 inputs audio that is distributed with the composited video provided by the server 102, one of the other three devices 104 is "bounced" and audio input for that device may be removed. Various forms of audio moderation and/or control options are supported for the client devices 104. For example, a first-in-first-out arrangement is supported, such that as one device 104 is added for speech, the device 104 that has held speech capability the longest is removed. In an alternative implementation, auto-detect functionality is supported in which audio capability for devices 104 is automatically turned on and off as vocal input is detected. In yet another implementation, a moderator assigns rights to the various devices for speech input.
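By way of illustration only, the first-in-first-out arrangement described above can be sketched with a bounded queue of devices that currently hold audio input. The class `AudioSlots`, its method names and the device identifiers are hypothetical.

```python
from collections import deque

class AudioSlots:
    """Hypothetical first-in-first-out manager for a fixed number of open microphones."""

    def __init__(self, max_speakers=3):
        self.max_speakers = max_speakers
        self.active = deque()  # device that has held audio longest is on the left

    def request_audio(self, device_id):
        """Enable audio for device_id; return the bounced device, if any."""
        if device_id in self.active:
            return None  # already enabled, nothing changes
        bounced = None
        if len(self.active) >= self.max_speakers:
            bounced = self.active.popleft()  # remove the longest-held slot
        self.active.append(device_id)
        return bounced

slots = AudioSlots(max_speakers=3)
for device in ("dev1", "dev2", "dev3"):
    slots.request_audio(device)
bounced = slots.request_audio("dev4")  # a 4th device begins to speak
```

The auto-detect and moderator-assigned alternatives described above would replace the eviction rule while keeping the same limit on concurrently enabled devices.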
[0061] Figs. 5A-5G illustrate an example implementation and show display screens 500, 510, 520, 530, 540, 550 and 560, representing an implementation of the present application in which a host user presenter is selecting content for recording interactive video content and providing an interactive conference in accordance with an example implementation. As can be seen in the example implementation(s) shown in Figs. 4-10, a user can establish a video conference, record video content, share the content and interact with the users during the video conference.
[0062] Fig. 5H illustrates an example display screen 570 of a video conference provided on a client device 104 in accordance with an example implementation. As illustrated, the video composed by the server 102 comprising the plurality of video feeds from the respective client devices 104 has been disassembled by the respective client device 104, and the various video feeds of the participants have been arranged according to the user's preference. In one or more implementations, the relative positions and formats of the disassembled users' video feeds can be predefined by the user or in another default configuration, and/or can be manipulated for a particular video conference session. The respective manipulations can be saved and used in future video conference sessions to position/format the respective feeds in accordance with a previous session.
[0063] In one or more implementations, the MOBILE APP can configure a user computing device 104 to provide graphical screen controls that, when selected by a user, result in one or more instructions being executed by a processor associated with a user computing device 104 to generate and send an invitation to one or more other user(s). An example interface including a graphical screen control 602 that is selectable for transmitting such an invitation to one or more second user computing devices 104 is illustrated in Fig. 6. The invitation may be via the Internet media extender 110 (e.g., APPLE TV), user computing device 104 or other device, depending upon a respective implementation and/or context of use. For example, a user engaged in an interactive conference via the Internet media extender 110 and television 112, may use a remote control device in connection with making selections and issuing instructions, in one or more implementations. An example virtual pointing device is shown and described in commonly assigned U.S. provisional patent application 62/105,682, filed on January 20, 2015, the entire contents of which are incorporated by reference as if set forth in its entirety, herein.
[0064] An invitation provided in accordance with the present application can include a hyperlink to a uniform resource locator ("URL") and can be sent via SMS, email, provided in the MOBILE APP, copied in memory, such as to a computing device "clipboard," provided on an Internet web page, or distributed in other suitable ways. Thereafter, at least one other user selects the hyperlink, such as via a mobile user computing device 104 (e.g., a smartphone, tablet or other computing device) that is configured by executing the MOBILE APP, and the user(s) is automatically connected via the communication session defined by the origin computing device 104. In case the MOBILE APP has not been installed on a respective user computing device 104 that received the invitation, then a prompt to install the MOBILE APP on the device 104 can be provided. In one or more implementations, the user computing device 104 can be routed to a site (e.g., APP STORE or ANDROID MARKET) to download/install the MOBILE APP. Moreover, once the MOBILE APP is installed and the user computing device 104 is configured, the user can use the link or otherwise connect to the origin device's session.
[0065] Alternatively, if the user is operating a mobile user computing device 104 that is not configured by executing the MOBILE APP, for example, when the user computing device 104 is a laptop computer, a desktop computer, or a network-based computer (e.g., a CHROMEBOOK), then the user can be prompted to launch a web browser software application (e.g., CHROME or SAFARI) and the user computing device 104 can connect to the session, for example via a hyperlink.
[0066] In one or more implementations, hyperlinks to respective sessions can be persistent, and accordingly be repeatedly used for connection. As noted herein, one or more names can be used to define and refer to a session, and shared in such ways that users can simply select the same link, select the hyperlink as provided in the web browser's history or in some other consistent manner to reconnect to a session and/or access content/information associated with a respective session.
[0067] In addition to graphical screen controls that are provided for inviting users to a respective session, various other graphical screen controls can be provided to perform various options. For example, and as illustrated in Fig. 7, graphical screen controls section 702 is provided that includes controls for adding sharable content, muting a microphone and recording a session. It is to be appreciated that section 702 and the corresponding controls therein are exemplary and that many other options can be provided in section 702 (see, for example, controls 1002 in Fig. 10).
[0068] In addition to a consumer or residential implementation of the present application, the teachings herein support enterprise use. For example, a business and/or enterprise implementation can integrate an Internet media extender and television to provide a video conferencing and/or data collaboration solution. A block diagram illustrating an example conference room in an enterprise implementation, including a plurality of user computing devices 104, Internet media extender 110 and television 112, is illustrated in Fig. 8.
[0069] The following is an example of an enterprise use implementation of the present application. A group of people wish to start a data and video conference collaboration session. Some of the people are located in a conference room that includes a TV and a connected Internet media extender (e.g., APPLE TV). Others of the people are in various remote geographic locations, with some using mobile computing devices (e.g., smartphones or tablet computers) and some using desktop/laptop computers. A user associated with the origin host starts the session via the Internet media extender via the TV APP and/or a mobile computing device 104 and via the MOBILE APP, which have been downloaded and installed on at least some of the respective devices 104 and 110 in the conference room and operated by the participants. After an initial configuration, which may occur the first time the MOBILE APP and/or TV APP is launched, audio/video content from one or more of the computing devices 104 is provided instantly on the television 112 via the Internet media extender 110.
[0070] Continuing with the present enterprise use example, thereafter an invitation can be transmitted, e.g., via the Internet media extender or mobile computing device 104, to one or more other users' computing devices 104. The invitation can include a hyperlink to a URL that can be sent via SMS, email, provided in the MOBILE APP or on an Internet web page. Users of computing devices 104 select the link, which configures the user computing device(s) 104 to launch the MOBILE APP and join the session automatically.
[0071] The present application supports integration of multiple cameras and microphones that can be connected remotely to an Internet media extender, such as APPLE TV. For example, a plurality of mobile computing devices 104 (e.g., iPhone/iPad/laptop) connect to a respective session and each provides an audio/video feed to the Internet media extender 110 and television 112. This is similar, in practice, to a "TouchCast" studio multi-camera setup, which allows multiple cameras to feed into an authoring tool. Such an authoring tool is shown and described in greater detail in commonly assigned U.S. Patent No. 9,363,448, issued June 7, 2016. Supporting live audio/video feed by multiple cameras provides an advantage and technological benefit for multiple people located in the same room and/or remotely located to utilize their respective mobile devices. In one or more implementations, audio detection mechanisms can be employed such that when a user speaks, feed from the microphone and/or camera on that user's respective device is provided on audio output (e.g., speakers) associated with the television 112 (via, for example, the Internet media extender 110), as well as on connected computing devices 104 operated by people remotely located (i.e., not in the local setting). This provides a different and much improved solution to a conference room "bowling-alley-view" of a single camera located at the head of a table, which tries to capture everyone in the conference room. In one or more implementations, cameras associated with the connected computing devices 104 can be "cut to" via one of several ways. In one case, for example, the host user can make selections to switch input from various cameras/devices. In another case, for example, the MOBILE APP configures the respective computing devices 104 with automatic speaker detection, which operates to detect when a user is speaking, and input from that user's respective camera/microphone can be presented to the other user computing devices 104 in the session. In yet another case, for example, a user proactively takes control, such as by simply tapping on the screen of his or her user computing device 104, to make that user's device 104 provide the primary audio/video feed, which can be presented to the other user computing devices 104 in the session.
[0072] In one or more implementations, the present application supports video compositing on the TV APP and/or the MOBILE APP, which can include sharing of content between a plurality of devices. The present application supports multiple configurations in which multiple users are presented during the same session. For example, a primary presenter is viewable on all the devices, and everyone else connected to the session views the primary presenter and/or content the primary presenter shares. In another configuration, co-viewing is supported, in which a plurality of (e.g., all) members of the session are visible on the screen at the same time. In such a setting, live co-viewing of content viewed by participants is preferably synchronized. For example, content can include YouTube videos, and a group of viewers watch videos together and are in sync as they do so, allowing participants to control the playlist of upcoming content, effectively collaboratively creating a playlist. Another example can include video streams that are live from an external feed. Other examples include viewing a website, and reviewing images and videos from the users' mobile devices 104 (via same stream from device 104 to the Internet media extender 110).
[0073] In yet another configuration, a hybrid mode is supported in which all participants can be presented, and one or more instructions executing on a computing device 104/data processing apparatus 102 implement an algorithm that governs which respective user computing device 104 is featured, making some device(s) more prominent as a function of current activity (e.g., someone speaking). In yet another implementation, multi-camera coverage is supported for a location (e.g., a conference room) which is configured by a primary camera positioned at or with the television 112 that is coupled to the Internet media extender 110, and that includes participants' user computing devices 104. In operation, content can cut between the full view of the room and the individual participants when they speak.
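By way of illustration only, an algorithm that features a device as a function of current activity can be sketched as selecting the loudest participant and falling back to the full view of the room when no one is speaking. The function name `pick_featured`, the normalized audio levels and the threshold value are assumptions; the present application does not specify how activity is measured.

```python
def pick_featured(audio_levels, room_view="room", threshold=0.2):
    """Return the feed to feature given per-device audio levels (0.0 to 1.0).

    Falls back to the primary room camera when no device reaches the threshold.
    """
    if not audio_levels:
        return room_view
    loudest = max(audio_levels, key=audio_levels.get)
    if audio_levels[loudest] >= threshold:
        return loudest   # cut to the participant who is speaking
    return room_view     # nobody is speaking: show the full view of the room

featured = pick_featured({"alice": 0.6, "bob": 0.3})
```

A practical implementation would likely smooth the levels over time so that the featured feed does not switch on brief noises.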
[0074] Figs. 9A-9D illustrate examples of sharing content between devices 104, in accordance with an example implementation of the present application. Examples in Figs. 9A-9D identify respective control over portions of content of interactive display screens of user computing devices 104, including as used to provide content on a plurality of devices 104, as well as for individual control over a respective user computing device 104. For example, and as illustrated in Fig. 9D, individual portions can be sized, moved, or otherwise transformed in a respective user's user computing device 104, without affecting other devices 104. Each of a plurality of users (e.g., USER 1 and USER 2) can be represented in each of the respective computing devices 104, such as shown and represented in Fig. 9B. In one or more implementations, the users are represented in video that is captured by cameras configured with the respective users' computing devices, and that is displayed substantially in real-time.
[0075] Fig. 10 illustrates an example set of applications and controls 1002 available via the MOBILE APP and operating on user computing device 104 during an interactive video conferencing session, in accordance with the present application. As shown in controls 1002, options are provided for drawing and for providing content from online sources (e.g., webpages, Wikipedia, news sources, maps, Twitter or the like), as well as controls for drawing tools and apps (e.g., calculator, image editing/controls and audio/video feed). Other activities can be integrated, such as set forth in vApps, and in the context of communications. Examples include games such as Tic/Tac/Toe, Chess, Backgammon, puzzle games, draw together, trivia, cards and table games. Other games can include action games, such as first person shooter games. In addition to games, virtual adventure and exploration is supported, such as GOOGLE MAPS and 360 views, as are photo galleries (e.g., FLICKR, GOOGLE IMAGES), shopping galleries, planning tools (e.g., for travel, entertaining, ordering food, etc.), and various professional consultations, such as telemedicine.
[0076] In addition, workplace collaboration tools are supported, and members of a given session can collaborate to create something together. Examples of collaboration and collaborative tools include a collaborative whiteboard, document creation and editing, mood board collaboration, general brainstorming, reviewing comps, training and instruction (with real-time assessment), focus group input, and can include real-time voting, sharing, collective web site and/or web application use that includes (but is not limited to) documents, spreadsheets, content management, project management, calendar planning or the like.
[0077] Thus, as shown and described herein, in one or more implementations the present application provides systems and methods for participating in video conferencing and sharing that can be layered with interactive content, including content that is available over one or more data communication networks, such as the Internet. Further, one or more software applications provide a viewing/interactive tool for end-users who receive videos that are authored in accordance with the present application, including in an on-line interactive conference. Using a client interface (which may be configured in a MOBILE APP, or provided via a standard web browser application or other suitable platform), users may interact with videos as a function of touch and gestures, as well as other suitable interfaces, such as provided via an application that configures a smartphone or other mobile computing device to simulate a laser pointer, drawing tool, mouse, trackball, keyboard or other input device.
[0078] In one or more implementations of the present patent application, a processor configured with code processes information representing a selection event that occurred in the display unit. For example, a user makes a selection in a remote control software application operating on his or her mobile computing device (e.g., iPhone) in a portion of the display unit while the interactive media content in the display unit is provided therein. The processing that occurs can be to determine at least a relative time and location of the selection event that occurred in the second portion of the display. The information representing the selection event can be stored in one or more databases that are accessible to at least one computing device. The selection of an item can be processed to enable the interaction with at least a portion of the interactive media content at one of the remote devices associated with the selection event. This enables results of a respective interaction associated with the selection event to be viewable or otherwise provided at one particular remote device, but not viewable or otherwise provided at other of the remote devices.
[0079] Other functionality includes drawing and selection features, which are effective for explaining and highlighting features within a coordinated presentation, including during an interactive video conferencing session. Further, in one or more implementations, the present application supports providing virtual representations of respective users that are interfacing via a respective video conferencing session in accordance with the teachings herein. For example, one or more modules can be implemented to simulate, for the viewpoint of each of the respective users, that one particular user is located "behind" the display screen configured with, or otherwise accessible by, user computing device 104.
[0080] In addition to simulating a respective user in the display screen being at a particular location (e.g., "behind" the screen), a partial representation of a person, such as a person's hand or other part, can similarly be represented and displayed in dynamic ways to represent movement, such as in the context of an interactive video conferencing session. Such functionality can be provided in a particularly configured vAPP. An example is set forth in Figs. 11A-11E. Fig. 11A illustrates an example video conferencing session displayed on a user computing device 104, in which the session is between two users (USER 1 and USER 2). As shown in Fig. 11A, writing virtual control (e.g., icon) 1101 is included that, when selected, causes computing device 104 and/or data processing apparatus 102 to execute one or more instructions to enable a unique interactive environment which simulates the appearance of users.
[0081] Continuing with reference to Fig. 11A, after control 1101 has been selected, the simulation effect can occur dynamically as a user interacts with user computing device 104, such as by drawing or writing on the screen, using his or her finger, an electronic passive or active stylus or other suitably configured writing device. Upon a determination by computing device 104 that a user is interacting accordingly, one or more processes can operate to detect and capture specific information associated with the interaction, such as that a user is drawing on his or her display with a stylus and the specific location where the stylus is in contact with the display. Further, the computing device 104 and/or 102 can alter the video representation of an interacting user (e.g., USER 1) on one or more of the computing devices operated by the respective other users during the interactive video conferencing session. An example is shown in Fig. 11B, in which a display of a computing device operated by USER 2 is shown, and in which the video representation of USER 1 has been enlarged and shaded as a function of one or more filter effects (e.g., shading and size). In addition to altering the video representation of USER 1 on the computing device of USER 2, a graphical representation of a human hand 1102 has been overlaid on the display of user computing device 104. The hand 1102 is represented to be holding a writing instrument, such as a stylus (e.g., APPLE PENCIL). Furthermore, the writing instrument is configured to appear to be making contact with the surface of the display at the specific point 1104 at which USER 1 is interacting. In one or more implementations, the altered representation of USER 1 can be provided on all of the computing devices of the interactive video conferencing session, including the computing device operated by USER 1. Alternatively, the computing device operated by USER 1 is not affected while USER 1 interacts, such as by drawing on his or her display screen.
[0082] Thus, for example as shown in Figs. 11A-11E, by shading and enlarging the size of the video representation, USER 1 appears to be positioned on the other side or within the computing device 104 operated by another user (e.g., USER 2). This can be implemented using one or more image and/or video editing techniques. For example, adjustments can be made to the video content to implement transparency, shading, size or other suitable effects, and one or more layer masks can be used to selectively apply the adjustment. In addition or in the alternative, a video composite can be generated that dynamically changes (e.g., moves or changes positions) in accordance with respective user interaction and/or activity. Moreover, one or more overlays can be added or removed during an interactive video conferencing session in accordance with specific user activity. For example, Figs. 11B-11D illustrate video content showing USER 1 in an enlarged and shaded format, which occurs as a function of USER 1 interacting by writing on the display screen of his user computing device 104. Upon lifting his writing instrument away from the display screen, thereby ending his respective edits, the video showing USER 1 returns to its previous format (Fig. 11E). In one or more implementations, when the user ceases to interact, such as by writing on his or her display, the user's writing or drawing can remain displayed on each of the other users' computing devices. Thus, the illusion of a user drawing from the other side of the display can end, yet the artwork, writings or the like remains. In this way, a dynamic representation of a user interacting with his or her user computing device 104 is provided in accordance with the teachings herein.
[0083] By altering the respective appearance of video representing a user (e.g., USER 1) and adding one or more dynamic components (e.g., hand 1102), the present application provides for a realistic and interactive experience for users of an interactive video conferencing session. In one or more implementations, as a user (e.g., USER 1) interacts by writing on his or her display, computing device 104 and/or 102 detects and monitors the activity substantially in real-time. Moreover, a portion of the display can be mapped as a series of coordinates and/or pixels, and the respective location of each interaction can be determined and used for setting a corresponding location on each user's computing device during an interactive video conferencing session. For example, as a user (e.g., USER 1) draws at specific coordinates on his or her computing device, a virtual hand 1102 can be generated and displayed on each of the other users' computing devices, with the specific point 1104 of the stylus represented as being held by the virtual hand 1102 positioned at the same coordinates. Continuing with this example, as USER 1 draws (or writes as the case may be) at different locations, the virtual hand 1102 moves and the specific point 1104 of the stylus moves to the same location(s). In one or more implementations, the hand 1102 can be configured to turn and rotate, such as at particular locations within the display screen, in order to provide a more realistic and accurate representation of someone writing or drawing. In one or more implementations, Non-Uniform Rational Basis Splines ("NURBS"), as known in the art, can be used to represent geometric models (e.g., hands) accurately.
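By way of illustration only, setting a corresponding location on each user's computing device can be sketched as scaling a contact point from the drawing user's display to a viewing user's display. The sketch assumes the displays may differ in resolution and that positions are kept proportional; the function name `map_point` and the display sizes are hypothetical.

```python
def map_point(point, src_size, dst_size):
    """Map (x, y) on the source display to the same relative spot on the destination."""
    x, y = point
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    # Scale each axis independently so the point keeps its relative position.
    return (round(x * dst_w / src_w), round(y * dst_h / src_h))

# USER 1 touches the center of a 1024x768 display; the stylus point 1104
# of the virtual hand is placed at the center of a 2048x1536 display.
stylus_point = map_point((512, 384), (1024, 768), (2048, 1536))
```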
[0084] By providing a virtual and dynamic representation of a user interacting as shown and described herein, the present application provides for a significantly improved collaborative tool. Fig. 12A, for example, illustrates a display screen of a computing device operated by USER 2, and shows USER 1 drawing a landscape that includes mountains and a road. USER 1 is drawing the landscape to show USER 2 a type of photographic image that USER 1 would like to use for commercial purposes. After seeing the simple drawing, USER 2 selects an icon to browse for specific content, such as images that are stored on the computing device operated by USER 2. After locating a suitable image that is locally stored on USER 2's computing device, the image is inserted and shown to USER 1 (Fig. 12B).
[0085] Figs. 13A-13B illustrate example implementations of the present application from the perspective of a user (USER 1) drawing on a display. In the example shown in Fig. 13A, USER 1 is preparing to draw on his display using a stylus, but has not yet made contact with the display. In Fig. 13B, USER 1 has caused his stylus to make contact with the display and is drawing. In the example shown in Fig. 13B, the enlarged and shaded representation of USER 1 is displayed on the computing device operated by USER 1.
[0086] With the exemplary computing system environment being generally shown and discussed above, the method and system of the invention in accordance with illustrated embodiments will now be discussed. It is to be appreciated that the method described herein has been indicated in connection with a flow diagram for facilitating a description of the principal processes of an illustrated embodiment of the invention; however, certain blocks can be invoked in an arbitrary order, such as when the events drive the program flow such as in an object-oriented program. Accordingly, the flow diagram is to be understood as an example flow and that the blocks can be invoked in a different order than as illustrated.
[0087] Fig. 14 illustrates a flow diagram of steps S100 in connection with one example implementation of the present application. At step S102, one or more parameters that represent a session (e.g., #SALES) are received from a first user computing device 104. Using the parameter(s), a network session is defined (step S104). Thereafter, access is provided to the first user computing device 104 (step S106). The network session access can include, for example, functionality that is provided for presenter or origin devices, which can include additional rights beyond those provided to participant or non-presenter devices. From the first user computing device 104, information associated with additional computing device(s) 104 is received (step S108). Using the information received in step S108, different access to the network session is provided for the additional computing device(s) (step S110).
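Steps S102 through S110 can be sketched, purely as an illustration, in the following manner. The sketch assumes two access levels (presenter and participant) and an arbitrary set of rights for each; the names Role, RIGHTS, Session, define_session and invite, as well as the specific rights, are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Iterable


class Role(Enum):
    PRESENTER = "presenter"
    PARTICIPANT = "participant"


# Hypothetical rights; the application states only that the first access
# can include additional rights relative to participant access.
RIGHTS = {
    Role.PRESENTER: frozenset({"view", "interact", "invite", "add_vapp"}),
    Role.PARTICIPANT: frozenset({"view", "interact"}),
}


@dataclass
class Session:
    name: str
    members: Dict[str, Role] = field(default_factory=dict)

    def can(self, device_id: str, action: str) -> bool:
        """Return True if the device has joined and its role permits the action."""
        role = self.members.get(device_id)
        return role is not None and action in RIGHTS[role]


def define_session(params: dict, first_device: str) -> Session:
    """S102-S106: define the session from received parameters and give
    the first user computing device presenter-level access."""
    session = Session(name=params["name"])
    session.members[first_device] = Role.PRESENTER
    return session


def invite(session: Session, inviter: str, device_ids: Iterable[str]) -> None:
    """S108-S110: accept device information from a device that holds the
    invite right, and give each additional device participant-level access."""
    if not session.can(inviter, "invite"):
        raise PermissionError(f"{inviter} may not invite devices")
    for device_id in device_ids:
        # setdefault keeps an existing role, so a presenter is never downgraded.
        session.members.setdefault(device_id, Role.PARTICIPANT)
```

For example, a session named "#SALES" defined by one device would admit additional devices only through invite, and those devices would not hold the presenter-only rights.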
[0088] Continuing with reference to the flow diagram shown in Fig. 14, at step S112 audio and/or video ("audio/video") input is received from each of the respective devices 104. For example, feed from a camera and/or microphone that is operatively connected to each computing device 104 is received by data processing apparatus 102. Thereafter, a composited audio/video feed is generated using the audio/video input (step S114), and the composited audio/video feed is transmitted to each of the devices 104 (step S116). Each of the first user computing device 104 and the additional user computing devices 104 is configured via a gesture-based user interface to enable user-based selection and respective manipulation of at least a portion of the composited audio/video feed that results in a customized representation of the composited audio/video feed at a particular one of the at least one additional user computing device that is not represented at a particular other of the at least one additional user computing device. Thereafter, the process ends (step S118).
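One way to picture the customized representation described above is sketched below. The sketch assumes the composited feed can be modeled as a list of tiles, one per contributing device, and that a gesture (here, a pinch that scales a tile) is recorded only on the device where it occurs. The names Tile, composite and DeviceView are hypothetical; actual audio/video compositing is outside the scope of this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Tile:
    """One contributor's region of the composited feed (hypothetical model)."""
    source_device: str
    scale: float = 1.0


def composite(inputs: List[str]) -> List[Tile]:
    """S114: build the shared feed with one tile per device supplying input."""
    return [Tile(source_device=device_id) for device_id in inputs]


@dataclass
class DeviceView:
    """Per-device state: gesture results stay local and are never sent to peers."""
    device_id: str
    scale_overrides: Dict[str, float] = field(default_factory=dict)

    def on_pinch(self, source_device: str, factor: float) -> None:
        """Record a pinch gesture that resizes one tile on this device only."""
        current = self.scale_overrides.get(source_device, 1.0)
        self.scale_overrides[source_device] = current * factor

    def render(self, feed: List[Tile]) -> List[Tile]:
        """Apply this device's overrides to the shared feed without mutating it."""
        return [
            Tile(t.source_device, t.scale * self.scale_overrides.get(t.source_device, 1.0))
            for t in feed
        ]
```

Because each DeviceView keeps its own overrides, the same transmitted feed can appear enlarged on one additional device while remaining unchanged on another.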
[0089] Thus, as shown and described herein, various representations of users of computing devices 104 that are connected by an interactive video conferencing session can be provided dynamically and intuitively to improve collaboration and communication, more generally.
[0090] The present application can further include a feature that allows a user to simulate a laser pointer effect when presenting embedded content in a coordinated presentation. In an implementation, the on-board gyroscope provided with a mobile computing device, such as a smartphone, can be employed to locate where laser pointer functionality effectively should appear in a coordinated presentation. One or more elements within a coordinated presentation, in accordance with the present application, including in one or more vApps (substantially as shown and described in co-pending and commonly assigned Patent Application Serial Number 14/316,536), can be interacted with as a function of the mobile computing device simulating a laser pointer. The user can perform various laser-pointer functionality, such as to navigate between slides, point to or identify respective portions of a display including in a coordinated presentation, or perform other automated tasks. The pointer can be moved around the screen as a function of the gyroscope within the mobile computing device.
[0091] In one or more implementations a mobile computing device that is configured with the MOBILE APP is configured to simulate the remote control, laser device and/or annotation and drawing tool. As noted herein, various vApps can be included in a respective interactive video conferencing session and can be manipulated (including pointed at) by a user using the remote control. This enables a user to be decoupled from the respective authoring platform, such as the iPad or other tablet computer, by using a simulated remote control, for example, across the room. In operation, for example, content that is curated, e.g., queued for inclusion in a video conferencing session, can be selected simply by pressing an option in a graphical user interface provided on the mobile computing device operating as the remote control. Interactive elements, e.g., vApps can be embedded in the conference session simply by the author selecting an option to include the element(s) in the remote control software app. The author can add and/or remove interactive elements by selecting options in the remote control app operating on the mobile computing device, which effectively frees the author from being tied to the authoring tool (e.g., the iPad).
[0092] In addition, users or participants of a conferencing session of the present application can use a remote control, pointing and annotation functionality, including on their respective mobile computing devices (e.g., smartphones) to interact with coordinated presentations. For example, a viewer of a coordinated presentation can select a vApp during playback to be presented with the content therein, and thereafter highlight, draw on or perform other interactive functionality using the simulated remote control.
[0093] Thus, using the accelerometer and/or gyroscope in the smartphone or other mobile computing device, a virtual pointer, an annotating tool or other selection tool in a coordinated presentation can be remotely controlled. In this way, a smartphone or other mobile computing device effectively doubles as a mouse, selection device, drawing tool or other interactive device. Unlike an infrared or wired connection, the mobile computing device and coordinated presentation authoring/playback device preferably communicate over Wi-Fi. The remote can ask the "master" device via Wi-Fi or other protocol, such as Bluetooth, for permission to connect therewith. The telematics of the mobile computing device, such as an accelerometer and/or gyroscope, is employed over a digital IP connection to transmit to the presentation authoring and/or playback software, which in turn functions to control a simulated laser red dot, drawing tool or other functionality, which can be configured as a core function of the presentation authoring and/or playback application.
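As a minimal sketch of how gyroscope readings might drive the simulated laser dot, the following assumes that the mobile computing device reports angular rates (yaw and pitch, in radians per second) together with the elapsed time between readings, and that the playback software integrates those rates into a clamped on-screen position. The Pointer class, the sensitivity value and the sign conventions are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Tuple


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


@dataclass
class Pointer:
    """Simulated laser dot on a display of the given pixel size."""
    width: int
    height: int
    sensitivity: float = 800.0  # pixels per radian; hypothetical tuning value
    x: float = field(init=False)
    y: float = field(init=False)

    def __post_init__(self) -> None:
        # Start the dot at the center of the display.
        self.x = self.width / 2
        self.y = self.height / 2

    def update(self, yaw_rate: float, pitch_rate: float, dt: float) -> Tuple[float, float]:
        """Integrate one gyroscope sample over dt seconds.

        Assumed convention: positive yaw moves the dot right, and positive
        pitch moves it up (toward smaller y in screen coordinates).
        """
        self.x = clamp(self.x + yaw_rate * dt * self.sensitivity, 0, self.width - 1)
        self.y = clamp(self.y - pitch_rate * dt * self.sensitivity, 0, self.height - 1)
        return self.x, self.y
```

In such an arrangement, each sample received over the IP connection would be passed to update, and the returned position would be used to draw the red dot or position the drawing tool.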
[0094] In this way, a new form of remote control is provided by pairing a mobile computing device, such as a smartphone, with an authoring tool device, a display device, or other presentation device. Various functionality provided in connection with coordinated presentations, such as controlling teleprompters (e.g., speed), green screen functionality, whiteboard functionality, titles, interactive Web-based content, or the like, can be selected and controlled via the remote control in connection with the present application. As shown and described herein, the present application further provides for simulating laser pointer control, annotation and drawing devices, such as for controlling publicly viewed or accessible displays. In addition to controlling a privately viewed coordinated presentation, the present application is also useful for public speaking or other public presentation activities. Thus, in addition to remote control functionality, the present application supports simulation of a laser pointer, annotation and drawing tool.
[0095] Thus, as shown and described herein, the present application provides a simple to use, yet powerful interactive remote video conferencing platform that incorporates a plurality of computing devices, e.g., smartphones, tablets, laptops and desktops, and enables live, real-time sharing and conferencing. One or more televisions 112 can be implemented in the present application via an Internet media extender 110, and content can be provided from a plurality of remote sources, such as cameras and/or microphones configured with user computing devices 104 that are located remotely and communicating over the Internet.
[0096] Although many of the examples shown and described herein regard distribution of coordinated presentations to a plurality of users, the invention is not so limited. Although illustrated embodiments of the present invention have been shown and described, it should be understood that various changes, substitutions, and alterations can be made by one of ordinary skill in the art without departing from the scope of the present application.

Claims

What is Claimed is:
1. A system for providing respectively customizable interactive audio/video content to each of a plurality of computing devices during a networked communication session, the system comprising:
non-transitory processor readable media;
at least one processor operatively coupled to the non-transitory processor readable media, wherein the non-transitory processor readable media have instructions that, when executed by the at least one processor, causes the at least one processor to perform the following steps:
define the networked communication session as a function of parameter information received from a first user computing device;
provide first access to the networked communication session to the first user computing device;
receive, from the first user computing device, information associated with at least one additional user computing device;
provide respective access to the networked communication session to each of the at least one additional user computing device, wherein functionality afforded by the first access is different than functionality afforded by the second access;
generate a composited interactive audio/video feed comprised of audio/video input received during the networked communication session from each of the first user computing device and each of the at least one additional user computing device; and
transmit to each of the first user computing device and each of the at least one additional user computing device the composited audio/video feed, wherein each of the first user computing device and each of the at least one additional user computing device is configured via a gesture-based user interface to enable user-based selection and respective manipulation of at least a portion of the composited audio/video feed that results in a customized representation of the composited audio/video feed at a particular one of the at least one additional user computing device that is not represented at a particular other of the at least one additional user computing device.
2. The system of claim 1, wherein the non-transitory processor readable media have further instructions that, when executed by the at least one processor, causes the at least one processor to:
receive, from the first user computing device, information representing at least one video application ("vApp") that, when executed by the first user computing device and/or the at least one additional user computing device, results in functionality, information and/or content to be provided during the networked communication session; and
transmit, to at least one additional computing device, vApp identification information representing the at least one vApp, wherein the vApp identification information is usable by the at least one additional user computing device to access the vApp and/or configure the at least one additional user computing device to execute the vApp.
3. The system of claim 1, wherein the non-transitory processor readable media have further instructions that, when executed by the at least one processor, causes the at least one processor to:
receive, from the first user computing device and/or the at least one additional user computing device, information representing the at least one other user computing device; and provide the respective access to the networked communication session to each of the at least one additional user computing device as a function of the information representing the at least one other user computing device.
4. The system of claim 1, wherein the non-transitory processor readable media have further instructions that, when executed by the at least one processor, causes the at least one processor to:
configure at least one of the first user computing device and the at least one additional user computing device to join the networked communication session.
5. The system of claim 4, wherein at least one of the user computing devices is configured as a function of a mobile computing device software application.
6. The system of claim 1, wherein the non-transitory processor readable media have further instructions that, when executed by the at least one processor, causes the at least one processor to:
configure at least one of the additional user computing device(s) to present the composited audio/video feed on a display via an Internet media extender.
7. The system of claim 1, wherein at least one of the additional user computing device(s) communicate on the networked communication session via one or more of Real Time Streaming Protocol, Web Real-Time Communication and/or hypertext transport protocol live streaming.
8. The system of claim 1, wherein the non-transitory processor readable media have further instructions that, when executed by the at least one processor, causes the at least one processor to:
control the distribution of audio/video input from at least one of the first user computing device and each of the at least one additional user computing device, wherein controlling the distribution includes including or preventing audio and/or video content received from one of the user computing devices in the composite audio/video feed.
9. The system of claim 8, wherein the non-transitory processor readable media have further instructions that, when executed by the at least one processor, causes the at least one processor to:
detect when audio input is received from one of the user computing devices; and
include the audio input received from the one of the user computing devices in the composited audio/video feed.
10. The system of claim 1, wherein the non-transitory processor readable media have further instructions that, when executed by the at least one processor, causes the at least one processor to:
receive from one of the user computing devices information representing an interaction by the one of the user computing devices; and
provide a representation of the interaction to each other user computing device.
11. A method for providing respectively customizable interactive audio/video content to each of a plurality of computing devices during a networked communication session, the method comprising:
defining, by at least one processor configured to execute instructions stored on non-transitory processor readable media, the networked communication session as a function of parameter information received from a first user computing device;
providing, by the at least one processor, first access to the networked communication session to the first user computing device;
receiving, by the at least one processor and from the first user computing device, information associated with at least one additional user computing device;
providing, by the at least one processor, respective access to the networked communication session to each of the at least one additional user computing device, wherein functionality afforded by the first access is different than functionality afforded by the second access;
generating, by the at least one processor, a composited interactive audio/video feed comprised of audio/video input received during the networked communication session from each of the first user computing device and each of the at least one additional user computing device; and
transmitting, by the at least one processor, to each of the first user computing device and each of the at least one additional user computing device the composited audio/video feed, wherein each of the first user computing device and each of the at least one additional user computing device is configured via a gesture-based user interface to enable user-based selection and respective manipulation of at least a portion of the composited audio/video feed that results in a customized representation of the composited audio/video feed at a particular one of the at least one additional user computing device that is not represented at a particular other of the at least one additional user computing device.
12. The method of claim 11, further comprising:
receiving, by the at least one processor and from the first user computing device, information representing at least one video application ("vApp") that, when executed by the first user computing device and/or the at least one additional user computing device, results in functionality, information and/or content to be provided during the networked communication session; and
transmitting, by the at least one processor and to at least one additional computing device, vApp identification information representing the at least one vApp, wherein the vApp identification information is usable by the at least one additional user computing device to access the vApp and/or configure the at least one additional user computing device to execute the vApp.
13. The method of claim 11, further comprising:
receiving, by the at least one processor and from the first user computing device and/or the at least one additional user computing device, information representing the at least one other user computing device; and
providing, by the at least one processor, the respective access to the networked communication session to each of the at least one additional user computing device as a function of the information representing the at least one other user computing device.
14. The method of claim 11, further comprising:
configuring, by the at least one processor, at least one of the first user computing device and the at least one additional user computing device to join the networked communication session.
15. The method of claim 14, wherein at least one of the user computing devices is configured as a function of a mobile computing device software application.
16. The method of claim 11, further comprising:
configuring at least one of the additional user computing device(s) to present the composited audio/video feed on a display via an Internet media extender.
17. The method of claim 11, wherein at least one of the additional user computing device(s) communicate on the networked communication session via one or more of Real Time Streaming Protocol, Web Real-Time Communication and/or hypertext transport protocol live streaming.
18. The method of claim 11, further comprising:
controlling, by the at least one processor, the distribution of audio/video input from at least one of the first user computing device and each of the at least one additional user computing device, wherein controlling the distribution includes including or preventing audio and/or video content received from one of the user computing devices in the composite audio/video feed.
19. The method of claim 18, further comprising:
detecting, by the at least one processor, when audio input is received from one of the user computing devices; and
including, by the at least one processor, the audio input received from the one of the user computing devices in the composited audio/video feed.
20. The method of claim 11, further comprising:
receiving, by the at least one processor and from one of the user computing devices, information representing an interaction by the one of the user computing devices; and providing, by the at least one processor, a representation of the interaction to each other user computing device.
PCT/US2016/048720 2015-08-25 2016-08-25 System and method for interactive video conferencing WO2017035368A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP16840132.1A EP3342158A4 (en) 2015-08-25 2016-08-25 System and method for interactive video conferencing

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US201562209727P 2015-08-25 2015-08-25
US62/209,727 2015-08-25
US201562242029P 2015-10-15 2015-10-15
US62/242,029 2015-10-15
US201662329081P 2016-04-28 2016-04-28
US62/329,081 2016-04-28

Publications (1)

Publication Number Publication Date
WO2017035368A1 true WO2017035368A1 (en) 2017-03-02

Family

ID=58101076

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/048720 WO2017035368A1 (en) 2015-08-25 2016-08-25 System and method for interactive video conferencing

Country Status (2)

Country Link
EP (1) EP3342158A4 (en)
WO (1) WO2017035368A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107045307A (en) * 2017-05-04 2017-08-15 安徽庆睿实业有限责任公司 Video conference remote control
CN111711784A (en) * 2020-06-15 2020-09-25 北京字节跳动网络技术有限公司 Conference control method and device, readable medium and electronic equipment
CN114073057A (en) * 2019-07-08 2022-02-18 微软技术许可有限责任公司 Server-side rendered audio using client-side audio parameters
EP3804302A4 (en) * 2018-06-02 2022-03-16 Mersive Technologies, Inc. System and method of annotation of a shared display using a mobile device
WO2022204018A1 (en) * 2021-03-22 2022-09-29 Google Llc Multi-user interaction slates for improved video conferencing

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11374990B2 (en) 2020-09-14 2022-06-28 Box, Inc. Saving an overlay annotation in association with a shared document

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110249075A1 (en) * 2010-04-07 2011-10-13 Abuan Joe S Remote Control Operations in a Video Conference
US20110254912A1 (en) * 2009-01-27 2011-10-20 Mock Wayne E Using a Touch Interface to Control a Videoconference
US20150149930A1 (en) * 2013-11-27 2015-05-28 Facebook, Inc. Communication user interface systems and methods

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006021444A2 (en) * 2004-08-25 2006-03-02 Sony Ericsson Mobile Communications Ab Electronic equipment for a wireless communication system to transmit and receive information content during ongoing communication
US20080068448A1 (en) * 2006-09-18 2008-03-20 Hansen Robert A Method for adapting a device to participate in video conference calls
CN101286865B (en) * 2008-05-14 2010-11-10 华为技术有限公司 Electronic white board realizing method, device and system in audio and video meeting
NO331338B1 (en) * 2009-06-24 2011-11-28 Cisco Systems Int Sarl Method and apparatus for changing a video conferencing layout
US20150121250A1 (en) * 2013-10-31 2015-04-30 Avaya Inc. PROVIDING INTELLIGENT MANAGEMENT FOR WEB REAL-TIME COMMUNICATIONS (WebRTC) INTERACTIVE FLOWS, AND RELATED METHODS, SYSTEMS, AND COMPUTER-READABLE MEDIA

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3342158A4 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107045307A (en) * 2017-05-04 2017-08-15 安徽庆睿实业有限责任公司 Video conference remote control
EP3804302A4 (en) * 2018-06-02 2022-03-16 Mersive Technologies, Inc. System and method of annotation of a shared display using a mobile device
CN114073057A (en) * 2019-07-08 2022-02-18 微软技术许可有限责任公司 Server-side rendered audio using client-side audio parameters
CN114073057B (en) * 2019-07-08 2023-09-08 微软技术许可有限责任公司 Method and system for server-side rendering of audio using client-side audio parameters
CN111711784A (en) * 2020-06-15 2020-09-25 北京字节跳动网络技术有限公司 Conference control method and device, readable medium and electronic equipment
CN111711784B (en) * 2020-06-15 2022-10-18 北京字节跳动网络技术有限公司 Conference control method and device, readable medium and electronic equipment
WO2022204018A1 (en) * 2021-03-22 2022-09-29 Google Llc Multi-user interaction slates for improved video conferencing
US11509863B2 (en) 2021-03-22 2022-11-22 Google Llc Multi-user interaction slates for improved video conferencing
US12022234B2 (en) 2021-03-22 2024-06-25 Google Llc Multi-user interaction slates for improved video conferencing

Also Published As

Publication number Publication date
EP3342158A1 (en) 2018-07-04
EP3342158A4 (en) 2019-04-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16840132

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2016840132

Country of ref document: EP