WO2021151205A1 - Live-action camera, control, capture, routing, processing and broadcasting system and method - Google Patents

Live-action camera, control, capture, routing, processing and broadcasting system and method

Info

Publication number
WO2021151205A1
WO2021151205A1 (PCT/CA2021/050100)
Authority
WO
WIPO (PCT)
Prior art keywords
video
camera
data
cameras
frames
Prior art date
Application number
PCT/CA2021/050100
Other languages
English (en)
Inventor
Gary Shields
Original Assignee
D Serruya Consulting Ltd.
Priority date
Filing date
Publication date
Application filed by D Serruya Consulting Ltd. filed Critical D Serruya Consulting Ltd.
Publication of WO2021151205A1


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80Responding to QoS
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • H04L65/61Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
    • H04L65/611Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for multicast or broadcast
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/45Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from two or more image sensors being of different type or operating in different modes, e.g. with a CMOS sensor for moving images in combination with a charge-coupled device [CCD] for still images
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/50Constructional details
    • H04N23/51Housings

Definitions

  • the present invention relates to a novel system of live-action cameras which produce professional-quality video, together with the means to control them, capture their video output, route the video to a production system for processing, and broadcast the live video over TV networks or stream it to remote or local viewers at the event.
  • the cameras utilized for first-person perspective live-action capture by necessity require the wireless transfer of video to the broadcast equipment.
  • This radio link tends to have limited capacity to transfer video at higher bitrates or the ability to support multiple cameras simultaneously over the same limited radio spectrum.
  • Wi-Fi was designed to deliver the maximum performance when downloading information to multiple clients.
  • the communication protocols in use were not designed to effectively handle multiple sources uploading large volumes of video information to a single receiver location.
  • standard Wi-Fi systems are designed for point-to-multipoint transmissions, not multipoint-to-point transmissions, which is exactly the situation found in a distributed live-action camera system.
  • the McLennan and Harish system also uses Wi-Fi communications only over the short range from the edge of the field to the camera. From the receiver (access node), the video is sent over wired network connections to the server for processing. This limits such a solution to deployment in small areas where a wired network connection is available.
  • any attempt to overcome this wired local network approach with a Wi-Fi based mesh network would fail, as such a network is constrained by standard Wi-Fi protocols which require that each node listen to and capture a packet and then retransmit it to the next node. Only one node can transmit at a time, as all nodes must be on the same Wi-Fi channel, which supports only one user at a time. This means that after only three hops on this network, the effective data throughput would be 1/8th the total network speed of any one channel, a situation that is unusable for wide-scale live video capture and routing.
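As a rough back-of-the-envelope illustration of the halving effect described above (a sketch added for clarity, not part of the original disclosure; the simple half-rate-per-hop model is an assumption), the effective throughput of a single-channel, store-and-forward Wi-Fi mesh falls off as the channel rate divided by two for every hop:

```python
def effective_throughput_mbps(channel_rate_mbps: float, hops: int) -> float:
    """Rough model: on a single shared Wi-Fi channel, each store-and-forward
    hop halves the usable throughput, because a node cannot receive and
    retransmit simultaneously and only one node may transmit at a time."""
    return channel_rate_mbps / (2 ** hops)

# Example: a nominal 300 Mbps channel after three hops
print(effective_throughput_mbps(300, 3))  # 37.5 Mbps, i.e. 1/8th of the channel rate
```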
  • An additional challenge of live-event capture is that the event by necessity is captured using a combination of various camera types. Some of these cameras are connected to the broadcast system directly by video cables, some connect over wired network connections, and some are connected by wireless radio transmission to a nearby receiver which then sends the video over a high-speed wired or fiberoptic network to the broadcast equipment.
  • Some of the different camera types employed in live-action broadcasting include specialty cameras capable of new formats and viewing angles.
  • Such a camera type is the 360-degree spherical view camera, where a single camera housing contains multiple cameras, the outputs of which are combined into a 360-degree spherical projected image.
  • another common style of camera used in live-action capture is the type described by McLennan and Harish, which is a simple single camera with a fixed-focus lens attached to a helmet or another head or body position.
  • these cameras can be used for short video segments that have been edited to where the camera was actually facing something of interest to the broadcaster.
  • This style of camera is also subject to a great deal of jitter and motion artifacts from all the movement generated by the wearer.
  • None of the available live-action camera systems, nor the prior art, teaches a solution which incorporates a 3CMOS camera system capable of producing professional-quality, high-fidelity, transparently encoded video that is transferred over a high-capacity wireless network, and that is capable of integrating and merging the video from multiple camera sources, not necessarily inside the same camera housing, in real time, and of making that stable, professional-quality video available in real time for live broadcast.
  • the present invention provides a live-action camera system that contains at least one camera which transmits live video via a Wi-Fi 6 based Video Mesh Network, consisting of at least one such network node, to at least one computer server implementing an artificial neural network video processing system, where the video feed is prepared and relayed to live TV broadcasts or streamed to viewers locally at the event, or via the Internet.
  • a method providing a miniature live-action camera with one or more 3CMOS video sensors for increased video resolution and color fidelity.
  • a method providing a micro live-action camera comprised of one or more sub-miniature cellphone type high-resolution camera sensors.
  • a method providing a golf flagstick camera comprised of two cellphone sized high-resolution camera sensors and two super wide-angle lenses for the capture of a spherical 360-degree view of the golf green.
  • the camera incorporates flagstick movement information, combined with intended camera direction and object tracking, to keep the intended subject in the center of the video frame.
  • a method providing a dual 3CMOS 360-degree spherical capture camera.
  • a method providing a 3CMOS baseball cap mounted camera with an oversized resolution, camera motion, and object tracking to keep the subject centered in the video frame.
  • a method providing multiple cellphone sized image sensors arranged to provide the source images to generate a 360- degree hemispherical view camera that can be mounted to many surfaces, such as a helmet.
  • a method providing multiple 3CMOS sensor modules and wide-angle lenses to provide the source images to produce a high-fidelity 360-degree hemispherical camera that can be mounted to many surfaces, such as a surfboard.
  • a method providing a cellphone sized high-resolution camera with over-resolution image capture and a wide-angle lens capable of being worn on a player's body, in various positions and mounted by various means.
  • the camera incorporates the player’s movement information to keep the camera output image pointed in the intended direction by sub-sampling the video sensor data to keep the intended target in the center of the video frame.
  • a method providing a plurality of sub-miniature cameras with cellphone sized camera modules, which are arranged in a circle around the perimeter of an MMA ring providing a 360-degree surround video capture capability.
  • the cameras are embedded into the top ring and send their data and receive power through a wired Ethernet connection.
  • the individual camera connections are collected at a network switch where all the video information is relayed to the AT360 image processing server.
  • a method providing a plurality of miniature 3CMOS cameras arranged in a circle around a target area, such as a golf tee block.
  • the cameras send their data and receive their power through a wired Ethernet connection.
  • the individual camera connections are collected at a network switch where all the video information is relayed to the AT360 image processing server.
  • a method providing a Video Mesh Network for transferring the video and control information through a wireless wide-area network.
  • One such method to achieve this is a Video Mesh Network created from one or more Wi-Fi 6 wireless network nodes which function as access points to the Video Mesh Network as well as network traffic routers and relay points.
  • the Video Mesh Network nodes utilize a proprietary Video Mesh Network Protocol (VMNP) which provides a mechanism to interleave video transfers from multiple cameras and format the video data to synchronize and move the high volume of video traffic throughout the network, minimizing congestion and video lag. Additionally, the VMNP provides a process for camera configuration and control as well as network configuration.
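The VMNP itself is proprietary and its wire format is not published in this document; purely as a hypothetical sketch of the kind of per-frame header such a protocol could carry (camera identity, synchronized capture timestamp, payload length, and the assigned resource unit and TXOP slot), with every field name and width being an assumption:

```python
import struct
import time

# Hypothetical VMNP video-frame header: the magic value, field names, and
# widths are illustrative assumptions, not the actual proprietary format.
VMNP_HEADER = struct.Struct(">4s B H Q I H H")  # magic, version, camera_id,
                                                # capture_ts_us, frame_len,
                                                # ru_index, txop_slot

def pack_vmnp_frame(camera_id: int, payload: bytes, ru_index: int, txop_slot: int) -> bytes:
    """Prepend an assumed VMNP header to one encoded video frame."""
    header = VMNP_HEADER.pack(
        b"VMNP", 1, camera_id,
        int(time.time() * 1_000_000),  # synchronized capture timestamp (microseconds)
        len(payload), ru_index, txop_slot,
    )
    return header + payload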
  • a method providing floating Video Mesh Network nodes that supply the means to route the wireless network traffic both laterally and vertically around obstacles that would obstruct the video transmissions, such as large interfering waves when capturing video on the water.
  • the floating Video Mesh Network nodes operate in both water and air mediums to provide the necessary coverage.
  • a method for synchronizing the shutters of all connected cameras so that all the cameras capture an image at the same instant in time. This facilitates the joining of multiple images from multiple cameras into a larger composite image of the scene, the ability to capture multiple images of the event at the same moment in time, and the ability to create a Time Shot, where video or images are retrieved by referencing this common time element.
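A minimal sketch of how such a Time Shot lookup could work, assuming every camera's buffered frames are indexed by the shared, synchronized capture timestamp (the data layout and names here are illustrative assumptions):

```python
from bisect import bisect_left

def time_shot(camera_buffers: dict[int, list[tuple[int, bytes]]],
              target_ts_us: int) -> dict[int, bytes]:
    """For every camera, return the buffered frame whose synchronized capture
    timestamp is closest to target_ts_us. camera_buffers maps camera_id to a
    list of (timestamp_us, encoded_frame) tuples sorted by timestamp."""
    shot: dict[int, bytes] = {}
    for camera_id, frames in camera_buffers.items():
        if not frames:
            continue
        timestamps = [ts for ts, _ in frames]
        i = bisect_left(timestamps, target_ts_us)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(frames)]
        best = min(candidates, key=lambda j: abs(frames[j][0] - target_ts_us))
        shot[camera_id] = frames[best][1]
    return shot
```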
  • a process for capturing and delivering native resolution still images from the cameras connected to the system is provided.
  • a process for merging together multiple frames of video from multiple cameras, not necessarily residing in the same camera housing, and forming them into a larger composite video frame is provided.
  • a process to incorporate camera motion information, video feature tracking, and image motion heuristics to determine the correct position and orientation of the sub-image to use for the current video frame from the oversized high-resolution raw video data stream is disclosed.
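As a simple illustration of this sub-image selection (a deliberately reduced model added for clarity: the targeting vector is collapsed to a yaw/pitch offset and the feature-tracking and heuristic terms are omitted), the crop window inside the oversized frame could be chosen like this:

```python
def crop_window(sensor_w: int, sensor_h: int, out_w: int, out_h: int,
                yaw_offset_deg: float, pitch_offset_deg: float,
                deg_per_pixel: float) -> tuple[int, int, int, int]:
    """Choose the (x, y, w, h) sub-image of an oversized sensor frame so the
    intended target stays centered. The targeting vector is reduced here to a
    yaw/pitch offset in degrees; deg_per_pixel maps angle to sensor pixels."""
    # start from the centered crop, then shift by the angular offset converted to pixels
    x = (sensor_w - out_w) // 2 + int(yaw_offset_deg / deg_per_pixel)
    y = (sensor_h - out_h) // 2 + int(pitch_offset_deg / deg_per_pixel)
    # clamp so the window stays inside the sensor area
    x = max(0, min(x, sensor_w - out_w))
    y = max(0, min(y, sensor_h - out_h))
    return x, y, out_w, out_h
```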
  • a process for generating multiple, specific views linked to the motion of a selected target or change in orientation of the camera is provided.
  • a method to distribute the generated video to broadcasting services and local event viewers.

BRIEF DESCRIPTION OF THE DRAWINGS

  • FIG. 1 - Figure 1 depicts an overall example of possible configurations of the system using the various embodiments of the present invention.
  • FIG. 2 - Figure 2 depicts an example of one possible configuration of a camera (200) with two lenses, each with a 3CMOS module and Wi-Fi 6 communications.
  • FIG. 3 - Figure 3 depicts an example of one possible configuration of a camera (202) with four lenses, each with a 3CMOS module and Wi-Fi 6 communications.
  • FIG. 4 - Figure 4 depicts an example of one possible configuration of a camera (204) with a single lens with a 3CMOS module that communicates using a wired Ethernet connection.
  • FIG. 5 - Figure 5 depicts an example of one possible configuration of a dual-lens camera (206), each with a cellphone camera module video sensor and Wi-Fi 6 radio communications and capable of being mounted around a golf flagstick.
  • FIG. 6a - Figure 6a depicts an example of one possible configuration of a camera (208) with a single lens with a 3CMOS module and Wi-Fi 6 communications which is capable of being mounted to the brim of a cap.
  • FIG. 6b - Figure 6b depicts an example of one possible configuration of a camera (208) mounted on a baseball cap.
  • FIG. 7 - Figure 7 depicts an example of one possible configuration of a single-lens camera (210) with a cellphone camera module video sensor and Wi-Fi 6 radio communications, which is capable of being mounted on an athlete's body.
  • FIG. 8 - Figure 8 depicts an example of one possible configuration of a single-lens camera (212) with a cellphone camera module video sensor with wired Ethernet communications, which is capable of being embedded in the safety railing of an MMA fighting ring.
  • FIG. 9 - Figure 9 depicts an example of one possible configuration of a four-lens camera (214) using four cellphone camera module video sensors and Wi-Fi 6 radio communications, mounted on a helmet.
  • FIG. 10 - Figure 10 depicts an example of a Video Mesh Network node (100) that provides an access point into the Wi-Fi 6 wireless network.
  • FIG. 11 - Figure 11 depicts an image processing workstation (400) that turns the multiple video streams into output feeds for live broadcasting.
  • FIG. 12 - Figure 12 depicts an example of a beam-splitting prism module (252) that separates the red, green, and blue bands of light into separate paths and routes these bands of light to individual monochrome CMOS video sensors.
  • FIG. 13 - Figure 13 depicts a perspective view of an example of the beam-splitting prism (252).
  • FIG. 14 - Figure 14 depicts a block diagram of camera (200).
  • FIG. 15 - Figure 15 depicts a block diagram of camera (202).
  • FIG. 16 - Figure 16 depicts a block diagram of camera (204).
  • FIG. 17 - Figure 17 depicts a block diagram of camera (206).
  • FIG. 18 - Figure 18 depicts a block diagram of camera (208).
  • FIG. 19 - Figure 19 depicts a block diagram of camera (210).
  • FIG. 20 - Figure 20 depicts a block diagram of camera (212).
  • FIG. 21 - Figure 21 depicts a block diagram of camera (214).
  • FIG. 22 - Figure 22 depicts a block diagram of a Video Mesh Network routing node and network access point (100).
  • FIG. 23 - Figure 23 depicts a block diagram of the CNN-based image processing workstation (400).
  • FIG. 24 - Figure 24 depicts one possible configuration of a Video Mesh Network.
  • FIG. 25 - Figure 25 depicts the relative positions and size of encoder frames.
  • FIG. 26 - Figure 26 depicts how multiple Wi-Fi 6 clients simultaneously use an access point.
  • FIG. 27 - Figure 27 depicts how Wi-Fi 6 radio spectrum is allotted and selected in the Video Mesh Network Protocol (VMNP).
  • FIG. 28 - Figure 28 depicts how Wi-Fi 6 Resource Units are allotted and selected in the Video Mesh Network Protocol (VMNP).
  • FIG. 29 - Figure 29 depicts the various data elements of the Video Mesh Network Protocol (VMNP).
  • FIG. 30 - Figure 30 depicts how image color purity is affected when cameras use a 3CMOS image sensor or a Bayer pattern image sensor.
  • FIG. 31 - Figure 31 depicts how image resolution is affected when cameras use a 3CMOS image sensor or a Bayer pattern image sensor.
  • FIG. 32 - Figure 32 depicts the field of view (FOV) coverage from a 360-degree surround camera system and a reference camera used for training.
  • FIG. 33 - Figure 33 depicts the training mechanism for training a CNN to stitch images from a surround camera system.
  • FIG. 34a - Figure 34a depicts camera (212) embedded in a safety rail.
  • FIG. 34b - Figure 34b depicts a cross-section of camera (212) embedded in the safety rail.
  • FIG. 35 - Figure 35 depicts the field of view coverage for a surround camera system embedded in the safety rails and supports of an MMA fighting ring.
  • FIG. 36 - Figure 36 depicts the training mechanism for training a CNN to stitch images from a surround camera system where the cameras have parallel and angled alignment with each other, such as when used for an MMA fighting ring.
  • FIG. 37 - Figure 37 depicts various training images for training a CNN to stitch images from cameras with different orientations relative to each other.
  • FIG. 38 - Figure 38 depicts one possible image processing workflow through multiple CNNs.
  • Table 1 provides the selection code and video transmission bitrates for different levels of video transmission quality used in the Video Mesh Network Protocol (VMNP).
  • FIG. 1 depicts several possible embodiments of the present invention in use in a variety of applications.
  • camera (200) is utilized to provide 360-degree spherical video coverage from a racecar on a track.
  • Camera (200) uses two 3CMOS camera modules (252) with fisheye lenses to capture a 360-degree spherical image.
  • the RGB outputs from each 3CMOS camera module (252) feed into the related FPGA (254) which combines the separate color data into a new RGB image data stream.
  • the audio input from the associated microphone (250) is added to the data stream that is sent to the video encoder (230) which takes the audio and video streams from both 3CMOS camera modules and turns them into two separate series of h.265 encoded audio/video frames.
  • the CPU (220) connects to peripherals SD Card (224), inertial measurement unit (IMU) (226), GPS (227), h.265 video encoder (230) and the Wi-Fi 6 radio module (232).
  • the IMU (226) uses its internal X, Y, Z accelerometers and Yaw, Pitch, and Roll gyroscope data to calculate the motion of the camera relative to a set reference point.
  • the result is a targeting vector that points to where the reference point is relative to the camera's current position and is used for stabilizing the camera motion in the video and to assist in keeping the video centered on the subject of interest.
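A minimal sketch of how such a targeting vector can be maintained, under the simplifying assumption that orientation is obtained purely by integrating gyroscope rates (the real unit also fuses accelerometer data; all names here are illustrative):

```python
import numpy as np

class TargetingVector:
    """Tracks the camera's yaw/pitch/roll relative to a stored reference
    orientation. Simplified sketch: orientation is obtained by integrating
    gyroscope rates only."""

    def __init__(self):
        self.orientation = np.zeros(3)   # yaw, pitch, roll in degrees
        self.reference = np.zeros(3)     # orientation captured when the user sets the target

    def set_reference(self):
        self.reference = self.orientation.copy()

    def update(self, gyro_dps: np.ndarray, dt: float):
        # integrate angular rates (degrees per second) over the sample interval
        self.orientation += gyro_dps * dt

    def vector(self) -> np.ndarray:
        # angular offset pointing back toward the reference, wrapped to [-180, 180)
        return (self.reference - self.orientation + 180.0) % 360.0 - 180.0
```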
  • This targeting vector is read by CPU (220) and passed to the AI-360 image processing system (400) by inserting it into the data of the two encoded video streams.
  • GPS (227) tracks the camera’s position and reports this information to the CPU (220) for inclusion in the video stream data along with the targeting vector.
  • the GPS (227) also performs the function of an accurate time source for the real time clock (RTC) inside the CPU (220).
  • This RTC is synchronized with the RTC in all the other cameras using the network time protocol (NTP).
  • the RTC is used to synchronize the triggering of the video sensor shutter so that all cameras capture images at the exact same instant in time, and for producing accurate synchronized timestamps for every video frame capture.
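A short sketch of this synchronized-shutter scheme, assuming the RTC has already been disciplined by NTP so every camera shares the same time base (the frame interval and function names are illustrative assumptions):

```python
import time

FRAME_INTERVAL_US = 33_333  # ~30 fps; every camera uses the same interval

def next_shutter_time_us(now_us: int) -> int:
    """All cameras trigger on the same absolute boundaries of the shared,
    NTP-disciplined clock, so frames are captured at the same instant."""
    return ((now_us // FRAME_INTERVAL_US) + 1) * FRAME_INTERVAL_US

def wait_and_trigger(trigger_shutter) -> int:
    now_us = int(time.time() * 1_000_000)
    target_us = next_shutter_time_us(now_us)
    time.sleep((target_us - now_us) / 1_000_000)
    trigger_shutter()
    return target_us  # also used as the frame's synchronized timestamp
```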
  • the CPU (220) reads the frame data generated by the encoder (230) and stores it in temporary buffer structures in RAM (222), for transmitting via the Wi-Fi 6 module (232) when it is its turn to send the data.
  • the video frames accumulate in the video buffer and when the structure has reached one minute of video in the buffer, the one-minute block of video is written to a timestamped file on the SD Card (224) and a new buffer structure is created for the next minute of video frames.
  • when the RAM (222) reaches a predetermined limit of remaining free space, the oldest one-minute buffer is deleted from the RAM (222).
  • when the SD Card (224) reaches a predetermined limit of remaining free space, the oldest one-minute file is deleted from the SD Card (224) to make space for more buffered video frames. These buffers of video frames are used to provide the data should retransmission be required, or if the video is being recalled from a later period in time. First, the CPU would try to retrieve the requested video information from the RAM (222), and if not present there, from the SD Card (224).
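A compact sketch of this two-tier buffering scheme (a RAM ring of one-minute blocks that are also written to timestamped SD-card files, with the oldest entries discarded when assumed limits are reached, and retrieval trying RAM before the card); the class, limits, and file naming are all illustrative assumptions:

```python
import os
from collections import OrderedDict

class FrameBuffer:
    """Two-tier buffer: recent one-minute blocks stay in RAM, every block is
    also written to a timestamped file on the SD card; the oldest entries on
    each tier are discarded when an (assumed) limit is reached."""

    def __init__(self, sd_path: str, ram_minutes: int = 5,
                 sd_min_free_bytes: int = 512 * 1024 * 1024):
        self.sd_path = sd_path
        self.ram_minutes = ram_minutes
        self.sd_min_free_bytes = sd_min_free_bytes
        self.ram: "OrderedDict[int, bytes]" = OrderedDict()  # minute_index -> encoded frames

    def add_minute(self, minute_index: int, block: bytes):
        self.ram[minute_index] = block
        with open(os.path.join(self.sd_path, f"{minute_index}.h265"), "wb") as f:
            f.write(block)                               # spill to a timestamped SD file
        if len(self.ram) > self.ram_minutes:             # RAM limit reached
            self.ram.popitem(last=False)                 # drop the oldest minute from RAM
        self._trim_sd()

    def _trim_sd(self):
        files = sorted(os.listdir(self.sd_path), key=lambda f: int(f.split(".")[0]))
        while files and self._sd_free() < self.sd_min_free_bytes:
            os.remove(os.path.join(self.sd_path, files.pop(0)))  # drop the oldest file

    def _sd_free(self) -> int:
        st = os.statvfs(self.sd_path)
        return st.f_bavail * st.f_frsize

    def get_minute(self, minute_index: int) -> bytes | None:
        if minute_index in self.ram:                     # try RAM first
            return self.ram[minute_index]
        path = os.path.join(self.sd_path, f"{minute_index}.h265")
        return open(path, "rb").read() if os.path.exists(path) else None
```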
  • the user interface panel (228) contains an LCD display for relaying information to the user, software-defined buttons for selecting options presented on the display, and an indicator LED which shows the current state of the camera.
  • the states include green to indicate the camera is powered on, yellow to indicate the camera is in preview mode and sending lower resolution video, and red to indicate that the camera is live and delivering high-resolution video.
  • a blinking red LED indicates a fault with the camera.
  • the power supply (236) uses the energy stored in battery (240) to power the camera.
  • the battery is charged by the power supply (236) from energy transferred by the wireless charging coil (238) which is embedded in the quick disconnect plate at the base of the camera body.
  • camera (200) connects to and communicates with the Video Mesh Network using standard Wi-Fi 6 protocols and transmits its video frames using the Video Mesh Network Protocol (180) to the nearest assigned Video Mesh Network node (100).
  • the Video Mesh Network nodes then route the video frame data through the Video Mesh Network comprised of multiple Video Mesh Network nodes (100).
  • the Video Mesh Network nodes (100) are loaded with routing and radio spectrum configuration information via the Video Mesh Network Protocol (180), which directs each node on where to send the incoming packets and on which resource units (RU) the information is to be transmitted or received during this transmission opportunity (TXOP) time slice.
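A hypothetical sketch of the per-node configuration such a protocol could push to each node: a routing table keyed on destination plus the resource-unit and TXOP-slot assignment governing when the node may forward that traffic (every structure, field name, and the slot count here is an assumption):

```python
from dataclasses import dataclass

@dataclass
class RouteEntry:
    destination: str      # e.g. address of the AI-360 workstation (illustrative)
    next_hop: str         # next mesh node, or "ethernet" to leave the wireless mesh
    ru_index: int         # Wi-Fi 6 resource unit assigned to this flow
    txop_slot: int        # which TXOP time slice this node may transmit in

class MeshNodeConfig:
    """Routing and spectrum configuration loaded into a node (hypothetical model)."""

    def __init__(self, entries: list[RouteEntry], txop_slots: int = 4):
        self.table = {e.destination: e for e in entries}
        self.txop_slots = txop_slots

    def forward(self, destination: str, current_txop: int) -> RouteEntry | None:
        entry = self.table.get(destination)
        # transmit only in the TXOP slice and on the RU assigned to this flow
        if entry and current_txop % self.txop_slots == entry.txop_slot:
            return entry
        return None
```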
  • the video frames are moved through the Video Mesh Network nodes (100) until they reach a root node where they exit the Video Mesh Network and proceed to their final destination address through a wired Ethernet connection (108) to a router (120) which relays the video frame to the AI-360 image processing workstation (400).
  • the AI-360 image processing system (400) collects the individual frames from multiple cameras and assembles them back into their individual video streams.
  • the frames of these streams are processed through a series of convolutional neural network (CNN) processors (600) where the images undergo any needed quality adjustments and the spatially related frames are merged into a larger composite virtual video frame from which the desired view to be broadcast is selected.
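A minimal PyTorch-style sketch of the kind of convolutional stage that could blend two spatially overlapping frames into a composite (the architecture is purely illustrative and stands in for the trained CNN processors described here):

```python
import torch
import torch.nn as nn

class StitchCNN(nn.Module):
    """Toy blending network: takes two overlapping RGB frames stacked on the
    channel axis and predicts a composite frame. Illustrative only."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, kernel_size=3, padding=1),
        )

    def forward(self, left: torch.Tensor, right: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([left, right], dim=1))

# Example: two aligned crops from adjacent cameras
left = torch.rand(1, 3, 256, 256)
right = torch.rand(1, 3, 256, 256)
composite = StitchCNN()(left, right)   # shape (1, 3, 256, 256)
```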
  • the output from the AI-360 image processing workstation (400) is sent to multiple destinations through Ethernet cable (108) via router (120), where it is distributed to local onsite TV broadcast equipment for over the air broadcasts, streamed to local event viewers (500) through the Video Mesh Network nodes (100) using standard Wi-Fi protocols for maximum compatibility with older mobile devices, and transferred via the Internet (140) to a video transcoding and streaming service provider (106) for distribution to remote viewers through the Internet (140) to their remote viewing devices (130).
  • camera (202) is utilized to provide 360-degree hemispherical video coverage of a surfer.
  • Camera (202) uses four 3CMOS camera modules (252) to capture a 360-degree hemispherical image around the camera.
  • the RGB outputs from each 3CMOS camera module (252) feed into the related FPGA (254) which combines the separate color data into a new RGB image data stream.
  • the audio input from the associated microphone (250) is added to the data stream that is sent to the video encoder (230) which takes the audio and video streams from the four 3CMOS camera modules and turns them into four separate series of h.265 encoded audio/video frames.
  • the CPU (220) connects to peripherals SD Card (224), inertial measurement unit (IMU) (226), GPS (227), h.265 video encoder (230) and the Wi-Fi 6 radio module (232).
  • the IMU (226) uses its internal X, Y, Z accelerometers and Yaw, Pitch, and Roll gyroscope data to calculate the motion of the camera relative to a set reference point.
  • the result is a targeting vector that points to where the reference point is relative to the camera's current position and is used for stabilizing the camera motion in the video and to assist in keeping the video centered on the subject of interest.
  • This targeting vector is read by CPU (220) and passed to the AI-360 image processing system (400) by inserting it into the data of the encoded video streams.
  • GPS (227) tracks the camera’s position and reports this information to the CPU (220) for inclusion in the video stream data along with the targeting vector.
  • the GPS (227) also performs the function of an accurate time source for the real time clock (RTC) inside the CPU (220).
  • This RTC is synchronized with the RTC in all the other cameras using the network time protocol (NTP).
  • the RTC is used to synchronize the triggering of the video sensor shutter so that all cameras capture images at the exact same instant in time, and for producing accurate synchronized timestamps for every video frame capture.
  • the CPU (220) reads the frame data generated by the encoder (230) and stores it in temporary buffer structures in RAM (222), for transmitting via the Wi-Fi 6 module (232) when it is its turn to send the data.
  • the video frames accumulate in the video buffer and when the structure has reached one minute of video in the buffer, the one-minute block of video is written to a timestamped file on the SD Card (224) and a new buffer structure is created for the next minute of video frames.
  • when the RAM (222) reaches a predetermined limit of remaining free space, the oldest one-minute buffer is deleted from the RAM (222).
  • when the SD Card (224) reaches a predetermined limit of remaining free space, the oldest one-minute file is deleted from the SD Card (224) to make space for more buffered video frames. These buffers of video frames are used to provide the data should retransmission be required, or if the video is being recalled from a later period in time. First, the CPU would try to retrieve the requested video information from the RAM (222), and if not present there, from the SD Card (224).
  • the user interface panel (228) contains an LCD display for relaying information to the user, software-defined buttons for selecting options presented on the display, and an indicator LED which shows the current state of the camera.
  • the states include green to indicate the camera is powered on, yellow to indicate the camera is in preview mode and sending lower resolution video, and red to indicate that the camera is live and delivering high-resolution video.
  • a blinking red LED indicates a fault with the camera.
  • the power supply (236) uses the energy stored in battery (240) to power the camera.
  • the battery is charged by the power supply (236) from energy transferred by the wireless charging coil (238) which is embedded in the quick disconnect plate at the base of the camera body.
  • Camera (202) connects to and communicates with the Video Mesh Network using standard Wi-Fi 6 protocols and transmits its video frames to a floating Video Mesh Network comprised of one or more nodes (101a) or (102a) using the Video Mesh Network Protocol (180).
  • the floating Video Mesh Network nodes (101a) are attached to any manner of water-based floating craft, including JetSkis (101b), small watercraft, or floating platforms.
  • the floating Video Mesh Network nodes (102a) are attached to any manner of air-based floating craft, including miniature blimps (102b), drones, or floating lighter-than- air platforms. These floating nodes capture and relay video information from camera (202) where line of sight transmission from the camera (202) to the shore Video Mesh Network node (100) is blocked by water or other obstructions.
  • the video information is transferred through the floating nodes (101a) or (102a) until it is delivered to a Video Mesh Network node on land (100) and delivered to the production truck (401).
  • the AI-360 image processing system (400) collects the individual frames from multiple cameras and assembles them back into their individual video streams.
  • the frames of these streams are processed through the convolutional neural network (CNN) processors (600) where the images undergo any needed quality adjustments and spatially related frames are merged into a larger composite frame from which the desired view to be broadcast is selected.
  • the output from the AI-360 image processing workstation (400) is sent to multiple destinations.
  • One live video feed is streamed to local event viewers (500) using standard Wi-Fi protocols for maximum compatibility with older mobile devices.
  • Another live feed is sent to multiple destinations through the satellite uplink (103) where it is relayed by satellite (104) back to a video transcoding and streaming service provider (106) for distribution to remote viewers through the Internet (140) to their remote viewing devices (130) and to broadcasting equipment for over the air broadcasts.
  • a 360-degree surround capture system is created using multiple cameras (204) arranged in a circle around the target area.
  • the surround capture system takes video from multiple fixed positions around a target area, in this instance a golf tee block, and sends them to the AI-360 image processing workstation (400) where they are then fused together to create a large virtual scene.
  • This scene can have a virtual camera moved around the circle surrounding the target subject and lets the viewers see the subject from multiple camera angles as if a real camera was being moved around the scene.
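A simplified sketch of how a virtual camera angle could be mapped onto the ring of physical cameras by cross-fading the two nearest views (an illustrative stand-in for the CNN-based view synthesis the system actually performs; names and the blending rule are assumptions):

```python
import numpy as np

def virtual_view(frames: list[np.ndarray], virtual_angle_deg: float) -> np.ndarray:
    """frames holds the aligned images from N cameras spaced evenly around the
    target. The virtual camera angle is mapped onto the two nearest physical
    cameras and their images are cross-faded."""
    n = len(frames)
    step = 360.0 / n
    pos = (virtual_angle_deg % 360.0) / step
    i, j = int(pos) % n, (int(pos) + 1) % n
    w = pos - int(pos)                       # blend weight toward the next camera
    return ((1.0 - w) * frames[i] + w * frames[j]).astype(frames[i].dtype)

# Example: 12 cameras around a golf tee block, virtual camera at 100 degrees
frames = [np.zeros((1080, 1920, 3), dtype=np.uint8) for _ in range(12)]
view = virtual_view(frames, 100.0)
```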
  • the RGB outputs from the 3CMOS camera module (252) feed into the related FPGA (254) which combines the separate color data together into a new RGB image data stream.
  • the audio input from the associated microphone (250) is added to the data stream that is sent to CPU (220) which takes the audio and video streams and turns them into a series of h.265 encoded audio/video frames.
  • the CPU (220) connects to peripherals SD Card (224), inertial measurement unit (IMU) (226), GPS (227), and the Ethernet module (234).
  • the IMU (226) uses its internal X, Y, Z accelerometers and Yaw, Pitch, and Roll gyroscope data to calculate the motion of the camera relative to a set reference point.
  • the result is a targeting vector that points to where the reference point is relative to the camera's current position and is used for stabilizing the camera motion in the video and to assist in keeping the video centered on the subject of interest.
  • This targeting vector is read by CPU (220) and passed to the AI-360 image processing system (400) by inserting it into the data of the encoded video stream.
  • GPS (227) tracks the camera’s position and reports this information to the CPU (220) for inclusion in the video stream data along with the targeting vector.
  • the GPS (227) also performs the function of an accurate time source for the real time clock (RTC) inside the CPU (220).
  • This RTC is synchronized with the RTC in all the other cameras using the network time protocol (NTP).
  • the RTC is used to synchronize the triggering of the video sensor shutter so that all cameras capture images at the exact same instant in time, and for producing accurate synchronized timestamps for every video frame captured.
  • the CPU (220) takes the generated frame data and stores it in temporary buffer structures in RAM (222) for transmitting via the Ethernet module (234).
  • the video frames accumulate in the video buffer and when the structure has reached one minute of video in length, the one-minute block of video is written to a timestamped file on the SD Card (224) and a new buffer structure is created for the next minute of video frames.
  • when the RAM (222) reaches a predetermined limit of remaining free space, the oldest one-minute buffer is deleted from the RAM (222).
  • when the SD Card (224) reaches a predetermined limit of remaining free space, the oldest one-minute file is deleted from the SD Card (224) to make space for more buffered video frames. These buffers of video frames are used to provide the data should retransmission be required, or if the video is being recalled from a later period in time. First, the CPU would try to retrieve the requested video segment from the RAM (222), and if not present there, from the SD Card (224).
  • the user interface panel (229) contains an indicator LED which shows the current state of the camera.
  • the states include green to indicate the camera is powered on, yellow to indicate the camera is in preview mode and sending lower resolution video, and red to indicate that the camera is live and delivering high-resolution video.
  • a blinking red LED indicates a fault with the camera.
  • the power supply (242) uses Power Over Ethernet (PoE) from the Ethernet module (234) to power the camera.
  • the surround cameras (204) connect to a wired Ethernet network using Ethernet cables (108) and communicate with the AI-360 image processing workstation (400) using the Video Mesh Network Protocol (180).
  • the video data is routed to the AI-360 image processing workstation (400) via router (120).
  • the AI-360 image processing system (400) collects the individual frames from the multiple cameras of the surround system and assembles them back into their individual video streams.
  • the frames of these streams are processed through the convolutional neural network (CNN) processors (600) where the images undergo any needed quality adjustments and spatially related frames are merged into a larger composite frame from which the desired view to be broadcast is selected.
  • This virtual frame forms a circle around the target subject and the operator can select a view of the target from a virtual camera that can be moved in a circle around that target providing the viewers with different views of the scene.
  • the output from the AI-360 image processing workstation (400) is sent to multiple destinations through Ethernet cable (108) to router (120), where it is distributed to local onsite TV broadcast equipment for over the air broadcasts, streamed to local event viewers (500) through the Video Mesh Network nodes (100) using standard Wi-Fi protocols for maximum compatibility with older mobile devices, and transferred via the Internet (140) to a video transcoding and streaming service provider (106) for distribution to remote viewers through the Internet (140) to their remote viewing devices (130).
  • camera (206) is utilized to provide 360-degree spherical video coverage from a golf flagstick.
  • This camera uses a pair of fisheye lenses coupled with a micro-sized high-resolution cellphone camera module (256), to capture a 360-degree spherical image surrounding itself.
  • the camera (206) also captures its own motion using the inertial measurement unit (IMU) (226) which generates a vector to a user- designated target point in the scene.
  • the IMU calculates a vector that tells the AI-360 image processing system (400) where in the virtual scene the virtual camera is to be pointed, thereby stabilizing the camera movement as well as the video image and removing the unwanted camera motion.
  • the video encoder (230) combines the video output from the cellphone camera modules (256) with the audio from the associated microphone (250) and turns them into a series of h.265 encoded audio/video frames.
  • the CPU (220) connects to peripherals SD Card (224), inertial measurement unit (IMU) (226), GPS (227), h.265 video encoder (230) and the Wi-Fi 6 radio module (232).
  • the IMU (226) uses its internal X, Y, Z accelerometers and Yaw, Pitch, and Roll gyroscope data to calculate the motion of the camera relative to a set reference point.
  • the result is a targeting vector that points to where the reference point is relative to the camera's current position and is used for stabilizing the camera motion in the video and to assist in keeping the video centered on the subject of interest.
  • This targeting vector is read by CPU (220) and passed to the AI-360 image processing system (400) by inserting it into the data of the two encoded video streams.
  • GPS (227) tracks the camera’s position and reports this information to the CPU (220) for inclusion in the video stream data along with the targeting vector.
  • the GPS (227) also performs the function of an accurate time source for the real time clock (RTC) inside the CPU (220).
  • This RTC is synchronized with the RTC in all the other cameras using the network time protocol (NTP).
  • the RTC is used to synchronize the triggering of the video sensor shutter so that all cameras capture images at the exact same instant in time, and for producing accurate synchronized timestamps for every video frame capture.
  • the CPU (220) reads the frame data generated by the encoder (230) and stores it in temporary buffer structures in RAM (222), for transmitting via the Wi-Fi 6 module (232) when it is its turn to send the data.
  • the video frames accumulate in the video buffer and when the structure has reached one minute of video in the buffer, the one-minute block of video is written to a timestamped file on the SD Card (224) and a new buffer structure is created for the next minute of video frames.
  • when the RAM (222) reaches a predetermined limit of remaining free space, the oldest one-minute buffer is deleted from the RAM (222).
  • when the SD Card (224) reaches a predetermined limit of remaining free space, the oldest one-minute file is deleted from the SD Card (224) to make space for more buffered video frames.
  • These buffers of video frames are used to provide the data should retransmission be required, or if the video is being recalled from a later period in time.
  • First, the CPU would try to retrieve the requested video information from the RAM (222), and if not present there, from the SD Card (224).
  • the user interface panel (229) contains an indicator LED which shows the current state of the camera.
  • the states include green to indicate the camera is powered on, yellow to indicate the camera is in preview mode and sending lower resolution video, and red to indicate that the camera is live and delivering high-resolution video.
  • a blinking red LED indicates a fault with the camera.
  • the power supply (246) uses stored energy in the field- replaceable battery (249) to power the camera.
  • camera (206) connects to and communicates with the Video Mesh Network using standard Wi-Fi 6 protocols and transmits its video frames using the Video Mesh Network Protocol (180) to the nearest assigned Video Mesh Network node (100).
  • the Video Mesh Network nodes then route the video frame through the Video Mesh Network comprised of multiple Video Mesh Network nodes (100) or by rerouting the video data to a wired Ethernet connection.
  • Such routing of the video data from the wireless network to a wired network can happen for multiple reasons, including to bypass a transmission obstacle (109) or to send the data a long distance via a fiber optic link (110), eliminating the need for the data to pass through multiple nodes of the wireless network.
  • the video data can be reintroduced into the wireless network by connecting the wired Ethernet cable (108) to an available Video Mesh Network node (100).
  • the Video Mesh Network nodes (100) are loaded with routing and radio spectrum configuration information via the Video Mesh Network Protocol (180), which directs each node on where to send the incoming packets and on which resource units (RU) the information is to be transmitted or received during this transmission opportunity (TXOP) time slice, and which traffic should get rerouted to a wired Ethernet connection at that node.
  • the video frames are moved through the various networks until they reach their final destination address through a wired Ethernet connection (108) to a router (120) which relays the video frame to the AI-360 image processing workstation (400).
  • the AI-360 image processing system (400) collects the individual frames from multiple cameras and assembles them back into their individual video streams.
  • the frames of these streams are processed through the convolutional neural network (CNN) processors (600) where the images undergo any needed quality adjustments and spatially related frames are merged into a larger composite frame from which the desired view to be broadcast is selected.
  • the output from the AI-360 image processing workstation (400) is sent to multiple destinations through Ethernet cable (108) to router (120), where it is distributed to local onsite TV broadcast equipment for over the air broadcasts, streamed to local event viewers (500) through the Video Mesh Network nodes (100) using standard Wi-Fi protocols for maximum compatibility with older mobile devices, and transferred via the Internet (140) to a video transcoding and streaming service provider (106) for distribution to remote viewers through the Internet (140) to their remote viewing devices (130).
  • camera (208) is utilized to provide a first-person point of view camera; in this case, mounted to a baseball cap and capturing video during a baseball game.
  • the camera (208) houses a 3CMOS camera module (252) and captures its own motion using the inertial measurement unit (IMU) (226) which generates a vector to a user-designated target point in the scene.
  • the IMU (226) calculates a vector that tells the AI-360 image processing system (400) where in the virtual scene the virtual camera is to be pointed, thereby stabilizing the camera motion and the video image and removing the unwanted camera motion as the wearer engages in the sporting activity.
  • the RGB outputs from the 3CMOS camera module (252) feed into the related FPGA (254) which combines the separate color data together into a new RGB image data stream.
  • the audio input from the associated microphone (250) is added to the data stream that is sent to the CPU (220) which takes the audio and video streams from the 3CMOS camera module and turns them into a series of h.265 encoded audio/video frames.
  • the CPU (220) connects to peripherals SD Card (224), inertial measurement unit (IMU) (226), and the Wi-Fi 6 radio module (232).
  • the IMU (226) uses its internal X, Y, Z accelerometers and Yaw, Pitch, and Roll gyroscope data to calculate the motion of the camera relative to a set reference point.
  • the result is a targeting vector that points to where the reference point is relative to the camera's current position and is used for stabilizing the camera motion in the video and to assist in keeping the video centered on the subject of interest.
  • This targeting vector is read by CPU (220) and passed to the AI-360 image processing system (400) by inserting it into the data of the encoded video stream.
  • the real-time clock (RTC) inside the CPU (220) is synchronized with the RTC in all the other cameras using the network time protocol (NTP).
  • the RTC is used to synchronize the triggering of the video sensor shutter so that all cameras capture images at the exact same instant in time, and for producing accurate synchronized timestamps for every video frame capture.
  • the CPU (220) takes the encoded video frame data and stores it in temporary buffer structures in RAM (222), for transmitting via the Wi-Fi 6 module (232) when it is its turn to send the data.
  • the video frames accumulate in the video buffer and when the structure has reached one minute of video in the buffer, the one-minute block of video is written to a timestamped file on the SD Card (224) and a new buffer structure is created for the next minute of video frames.
  • when the RAM (222) reaches a predetermined limit of remaining free space, the oldest one-minute buffer is deleted from the RAM (222).
  • when the SD Card (224) reaches a predetermined limit of remaining free space, the oldest one-minute file is deleted from the SD Card (224) to make space for more buffered video frames.
  • These buffers of video frames are used to provide the data should retransmission be required, or if the video is being recalled from a later period in time.
  • First, the CPU would try to retrieve the requested video information from the RAM (222), and if not present there, from the SD Card (224).
  • the user interface panel (229) contains an indicator LED which shows the current state of the camera.
  • the states include green to indicate the camera is powered on, yellow to indicate the camera is in preview mode and sending lower resolution video, and red to indicate that the camera is live and delivering high-resolution video.
  • a blinking red LED indicates a fault with the camera.
  • the power supply (243) uses stored energy in battery (249) to power the camera. The battery is charged by the power supply (243) from power supplied by the USB (233).
  • camera (208) connects to and communicates with the Video Mesh Network using standard Wi-Fi 6 protocols and transmits its video frames using the Video Mesh Network Protocol (180) to the nearest assigned Video Mesh Network node (100).
  • the Video Mesh Network nodes then route the video frame through the Video Mesh Network comprised of multiple Video Mesh Network nodes (100).
  • the Video Mesh Network nodes (100) are loaded with routing and radio spectrum configuration information via the Video Mesh Network Protocol (180), which directs each node on where to send the incoming packets and on which resource units (RU) the information is to be transmitted or received during this transmission opportunity (TXOP) time slice, and which traffic should get rerouted to a wired Ethernet connection at that node.
  • the video frames are moved through the various networks until they reach their final destination address through a root node with a wired Ethernet connection (108) to a router (120) which relays the video frame to the AI-360 image processing workstation (400).
  • the AI-360 image processing system (400) collects the individual frames from multiple cameras and assembles them back into their individual video streams.
  • the frames of these streams are processed through the convolutional neural network (CNN) processors (600) where the images undergo any needed quality adjustments and spatially related frames are merged into a larger composite frame from which the desired view to be broadcast is selected.
  • the output from the AI-360 image processing workstation (400) is sent to multiple destinations through Ethernet cable (108) to router (120), where it is distributed to local onsite TV broadcast equipment for over the air broadcasts, streamed to local event viewers (500) through the Video Mesh Network nodes (100) using standard Wi-Fi protocols for maximum compatibility with older mobile devices, and transferred via the Internet (140) to a video transcoding and streaming service provider (106) for distribution to remote viewers through the Internet (140) to their remote viewing devices (130).
  • camera (210) is utilized to provide a first-person point of view camera; in this case, mounted to a basketball player’s jersey and capturing video during a basketball game.
  • the camera (210) houses a micro-sized high-resolution cellphone camera module (256).
  • the camera (210) also captures its own motion using the inertial measurement unit (IMU) (226) which generates a vector to a user-designated target point in the scene.
  • the IMU (226) calculates a vector that tells the AI-360 image processing system (400) where in the virtual scene the virtual camera is to be pointed, thereby stabilizing the camera motion and the video image and removing the unwanted camera motion as the wearer engages in the sporting activity.
  • the CPU (220) combines the video output from the cellphone camera module (256) with the audio from the associated microphone (250) and turns them into a series of h.265 encoded audio/video frames.
  • the CPU (220) connects to peripherals SD Card (224), inertial measurement unit (IMU) (226), and the Wi-Fi 6 radio module (232).
  • the IMU (226) uses its internal X, Y, Z accelerometers and Yaw, Pitch, and Roll gyroscope data to calculate the motion of the camera relative to a set reference point.
  • the result is a targeting vector that points to where the reference point is relative to the camera's current position and is used for stabilizing the camera motion in the video and to assist in keeping the video centered on the subject of interest.
  • This targeting vector is read by CPU (220) and passed to the AI-360 image processing system (400) by inserting it into the data of the encoded video stream.
  • the real-time clock (RTC) inside the CPU (220) is synchronized with the RTC in all the other cameras using the network time protocol (NTP).
  • the RTC is used to synchronize the triggering of the video sensor shutter so that all cameras capture images at the exact same instant in time, and for producing accurate synchronized timestamps for every video frame capture.
  • the CPU (220) takes the encoded video frame data and stores it in temporary buffer structures in RAM (222), for transmitting via the Wi-Fi 6 module (232) when it is its turn to send the data.
  • the video frames accumulate in the video buffer and when the structure has reached one minute of video in the buffer, the one-minute block of video is written to a timestamped file on the SD Card (224) and a new buffer structure is created for the next minute of video frames.
  • when the RAM (222) reaches a predetermined limit of remaining free space, the oldest one-minute buffer is deleted from the RAM (222).
  • when the SD Card (224) reaches a predetermined limit of remaining free space, the oldest one-minute file is deleted from the SD Card (224) to make space for more buffered video frames.
  • These buffers of video frames are used to provide the data should retransmission be required, or if the video is being recalled from a later period in time.
  • First, the CPU would try to retrieve the requested video information from the RAM (222), and if not present there, from the SD Card (224).
  • the user interface panel (229) contains an indicator LED which shows the current state of the camera.
  • the states include green to indicate the camera is powered on, yellow to indicate the camera is in preview mode and sending lower resolution video, and red to indicate that the camera is live and delivering high-resolution video.
  • a blinking red LED indicates a fault with the camera.
  • the power supply (243) uses stored energy in battery (249) to power the camera.
  • the battery is charged by the power supply (243) from power supplied by the USB (233).
  • camera (210) connects to and communicates with the Video Mesh Network using standard Wi-Fi 6 protocols and transmits its video frames using the Video Mesh Network Protocol (180) to the nearest assigned Video Mesh Network node (100).
  • the Video Mesh Network nodes then route the video frames through wired Ethernet cables (108) to a local router (120).
  • the Video Mesh Network nodes (100) are loaded with routing and radio spectrum configuration information via the Video Mesh Network Protocol (180), which directs each node on which resource units (RU) the information is to be transmitted or received during this transmission opportunity (TXOP) time slice, and which traffic should get rerouted to a wired Ethernet connection at that node.
  • the router (120) relays the video frames to the AI-360 image processing workstation (400).
  • the AI-360 image processing system (400) collects the individual frames from multiple cameras and assembles them back into their individual video streams.
  • the frames of these streams are processed through the convolutional neural network (CNN) processors (600) where the images undergo any needed quality adjustments and spatially related frames are merged into a larger composite frame from which the desired view to be broadcast is selected.
  • the output from the AI-360 image processing workstation (400) is sent to multiple destinations through Ethernet cable (108) to router (120), where it is distributed to local onsite TV broadcast equipment for over the air broadcasts, streamed to local event viewers (500) through the Video Mesh Network nodes (100) using standard Wi-Fi protocols for maximum compatibility with older mobile devices, and transferred via the Internet (140) to a video transcoding and streaming service provider (106) for distribution to remote viewers through the Internet (140) to their remote viewing devices (130).
  • a 360-degree surround capture system is created using multiple cameras (212) arranged around a mixed martial arts ring and embedded into the top safety rail as depicted in FIG. 34a and in cross-section FIG. 34b, and into the corner support posts in a similar manner.
  • the field of view of the MMA ring surround capture system is depicted in FIG. 35.
  • the surround capture system takes video from multiple fixed positions around the target area, which in this instance is a pair of fighters, and sends the images to the AI-360 image processing workstation (400) where they are then fused together to create a large virtual scene.
  • This scene can have a virtual camera moved around the circle surrounding the target subjects and lets the viewers see the action from multiple camera angles as if a real camera was being moved around in the scene.
  • the CPU (220) combines the video output from the high- resolution camera module (256) with the audio from the associated microphone (250) and turns them into a series of h.265 encoded audio/video frames.
  • the CPU (220) connects to peripherals inertial measurement unit (IMU) (226), and the Ethernet module (234).
  • the IMU (226) uses its internal X, Y, Z accelerometers and Yaw, Pitch, and Roll gyroscope data to calculate the motion of the camera relative to a set reference point.
  • the result is a targeting vector that points to where the reference point is relative to the camera's current position and is used for stabilizing the camera motion in the video and to assist in keeping the video centered on the subject of interest.
  • This targeting vector is read by CPU (220) and passed to the AI-360 image processing system (400) by inserting it into the data of the encoded video stream.
  • the real-time clock (RTC) inside the CPU (220) is synchronized with the RTC in all the other cameras using the network time protocol (NTP).
  • the RTC is used to synchronize the triggering of the video sensor shutter so that all cameras capture images at the exact same instant in time, and for producing accurate synchronized timestamps for every video frame capture.
  • the CPU (220) takes the encoded video frame data and stores it in temporary buffer structures in RAM (222), for transmitting via the Ethernet module (234) when it is its turn to send the data.
  • the oldest one-minute buffer is deleted from the RAM (222). These buffers of video frames are used to provide the data should retransmission be required, or if the video is recalled at a later time.
  • the power supply (242) uses Power Over Ethernet (PoE) from the Ethernet module (234) to power the camera.
  • the surround cameras (212) connect to a wired Ethernet network using Ethernet cables (108) and communicate with the AI-360 image processing workstation (400) using the Video Mesh Network Protocol (180), with the traffic routed to the AI-360 image processing workstation (400) via router (120).
  • the AI-360 image processing system (400) collects the individual frames from the multiple cameras of the surround system and assembles them into their individual video streams; a sketch of this reassembly step follows.
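As a rough illustration of this reassembly step, the sketch below groups incoming frame packets by camera and orders them by their synchronized capture timestamps. The packet field names are assumptions made for the example.

```python
from collections import defaultdict

def reassemble_streams(packets):
    """Group incoming frame packets back into per-camera streams.

    Each packet is assumed (hypothetically) to carry 'camera_id', 'timestamp'
    and 'payload' fields; packets may arrive interleaved and out of order.
    """
    streams = defaultdict(list)
    for pkt in packets:
        streams[pkt["camera_id"]].append(pkt)
    for cam_id in streams:
        # Order each camera's frames by the synchronized capture timestamp.
        streams[cam_id].sort(key=lambda p: p["timestamp"])
    return dict(streams)

packets = [
    {"camera_id": 2, "timestamp": 1.016, "payload": b"..."},
    {"camera_id": 1, "timestamp": 1.000, "payload": b"..."},
    {"camera_id": 2, "timestamp": 1.000, "payload": b"..."},
]
by_camera = reassemble_streams(packets)
print(sorted(by_camera))   # [1, 2]
print(len(by_camera[2]))   # 2 frames for camera 2, in time order
```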
  • the frames of these streams are processed through the convolutional neural network (CNN) processors (600) where the images undergo any needed quality adjustments and spatially related frames are merged into a larger composite frame from which the desired view to be broadcast is selected.
  • This virtual frame encircles the MMA ring and the system operator can select the view of the fighters from a virtual camera that can be moved in a circle around the ring.
  • the output from the AI-360 image processing workstation (400) is sent to multiple destinations through Ethernet cable (108) to router (120), where it is distributed to local onsite TV broadcast equipment for over the air broadcasts, streamed to local event viewers (500) through the Video Mesh Network nodes (100) using standard Wi-Fi protocols for maximum compatibility with older mobile devices, and transferred via the Internet (140) to a video transcoding and streaming service provider (106) for distribution to remote viewers through the Internet (140) to their remote viewing devices (130).
  • the IMU output permits the AI-360 image processing workstation (400) to effectively remove the camera motion as participants bump into the safety rails, producing a steady video of the action.
  • camera (214) is utilized to provide 360-degree hemispherical first-person video coverage, mounted on a helmet.
  • Camera (214) uses four micro-sized cellphone camera modules (256) to capture a 360-degree hemispherical image around the camera.
  • the camera (214) also captures its own motion using the inertial measurement unit (IMU) (226) which generates a vector to a user-designated target point in the scene.
  • the IMU calculates a vector that tells the AI-360 image processing system (400) where in the virtual scene the virtual camera is to be pointed, thereby stabilizing the camera movement as well as the video image and removing the unwanted camera motion.
  • the video encoder (230) combines the video output from the cellphone camera modules (256) with the audio from the associated microphone (250) and turns them into a series of h.265 encoded audio/video frames.
  • the CPU (220) connects to peripherals: the SD Card (224), the inertial measurement unit (IMU) (226), the GPS (227), the h.265 video encoder (230), and the Wi-Fi 6 radio module (232).
  • the IMU (226) uses its internal X, Y, Z accelerometers and Yaw, Pitch, and Roll gyroscope data to calculate the motion of the camera relative to a set reference point.
  • the result is a targeting vector that points to where the reference point is relative to the camera's current position and is used for stabilizing the camera motion in the video and to assist in keeping the video centered on the subject of interest.
  • This targeting vector is read by CPU (220) and passed to the AI-360 image processing system (400) by inserting it into the data of the two encoded video streams.
  • GPS (227) tracks the camera’s position and reports this information to the CPU (220) for inclusion in the video stream data along with the targeting vector.
  • the GPS (227) also performs the function of an accurate time source for the real time clock (RTC) inside the CPU (220).
  • This RTC is synchronized with the RTC in all the other cameras using the network time protocol (NTP).
  • the RTC is used to synchronize the triggering of the video sensor shutter so that all cameras capture images at the exact same instant in time, and for producing accurate synchronized timestamps for every video frame capture.
  • the CPU (220) reads the frame data generated by the encoder (230) and stores it in temporary buffer structures in RAM (222), for transmitting via the Wi-Fi 6 module (232) when it is its turn to send the data.
  • the video frames accumulate in the video buffer and when the structure has reached one minute of video in the buffer, the one-minute block of video is written to a timestamped file on the SD Card (224) and a new buffer structure is created for the next minute of video frames.
  • the oldest one-minute buffer is deleted from the RAM (222).
  • when the SD Card (224) reaches a predetermined limit of remaining free space, the oldest one-minute file is deleted from the SD Card (224) to make space for more buffered video frames.
  • These buffers of video frames are used to provide the data should retransmission be required, or if the video is recalled at a later time.
  • the CPU would try to retrieve the requested video information from the RAM (222), and if not present there, from the SD Card (224); a sketch of this rolling-buffer behavior follows.
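The sketch below models this rolling-buffer behavior: frames accumulate in RAM, completed one-minute blocks are written to timestamped files, the oldest file is deleted when a storage limit is reached, and retrieval checks RAM before falling back to storage. Class and method names are illustrative assumptions, and file and index handling is simplified.

```python
import os
from collections import deque

class RollingVideoBuffer:
    """Sketch of a one-minute rolling frame buffer with spill-over to storage."""

    def __init__(self, storage_dir, max_files=60, ram_minutes=2):
        self.storage_dir = storage_dir
        self.max_files = max_files            # stand-in for the SD Card free-space limit
        self.ram = deque(maxlen=ram_minutes)  # most recent one-minute blocks kept in RAM
        self.current = []                     # frames for the minute being filled
        self.minute_start = None
        os.makedirs(storage_dir, exist_ok=True)

    def add_frame(self, frame_bytes, timestamp):
        if self.minute_start is None:
            self.minute_start = timestamp
        self.current.append((timestamp, frame_bytes))
        if timestamp - self.minute_start >= 60.0:
            self._rotate(timestamp)

    def _rotate(self, now):
        # Write the completed minute to a timestamped file and start a new block.
        path = os.path.join(self.storage_dir, f"{int(self.minute_start)}.bin")
        with open(path, "wb") as f:
            for _, frame in self.current:
                f.write(frame)
        self.ram.append((self.minute_start, list(self.current)))  # deque drops the oldest block
        self.current, self.minute_start = [], now
        # Delete the oldest stored file once the storage limit is reached.
        files = sorted(os.listdir(self.storage_dir))
        while len(files) > self.max_files:
            os.remove(os.path.join(self.storage_dir, files.pop(0)))

    def frames_since(self, t0):
        """Serve a retransmission or recall request: check RAM first, then storage."""
        for start, frames in reversed(self.ram):
            if start <= t0 < start + 60.0:
                return [frame for ts, frame in frames if ts >= t0]
        return []  # fall back to reading the spilled files (omitted in this sketch)
```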
  • the user interface panel (229) contains an indicator LED which shows the current state of the camera.
  • the states include green to indicate the camera is powered on, yellow to indicate the camera is in preview mode and sending lower resolution video, and red to indicate that the camera is live and delivering high-resolution video.
  • a blinking red LED indicates a fault with the camera.
  • the power supply (236) uses the energy stored in battery (240) to power the camera.
  • the battery is charged by the power supply (236) from energy transferred by the wireless charging coil (238) which is embedded in the quick disconnect plate at the base of the camera body.
  • Camera (214) connects to and communicates with the Video Mesh Network using standard Wi-Fi 6 protocols and transmits its video frames using the Video Mesh Network Protocol (180) to the nearest assigned Video Mesh Network node (100); these nodes are suspended above the hockey rink.
  • the Video Mesh Network nodes then route the video frames through wired Ethernet cables (108) to a local router (120).
  • the Video Mesh Network nodes (100) are loaded with routing and radio spectrum configuration information via the Video Mesh Network Protocol (180), which directs each node on which resource units (RU) the information is to be transmitted or received during this transmission opportunity (TXOP) time slice, and which traffic should get rerouted to a wired Ethernet connection at that node.
  • the router (120) relays the video frames to the AI-360 image processing workstation (400).
  • the AI-360 image processing system (400) collects the individual frames from multiple cameras and assembles them back into their individual video streams.
  • the frames of these streams are processed through the convolutional neural network (CNN) processors (600) where the images undergo any needed quality adjustments and spatially related frames are merged into a larger composite frame from which the desired view to be broadcast is selected.
  • the output from the AI-360 image processing workstation (400) is sent to multiple destinations through Ethernet cable (108) to router (120), where it is distributed to local onsite TV broadcast equipment for over the air broadcasts, streamed to local event viewers (500) through the Video Mesh Network nodes (100) using standard Wi-Fi protocols for maximum compatibility with older mobile devices, and transferred via the Internet (140) to a video transcoding and streaming service provider (106) for distribution to remote viewers through the Internet (140) to their remote viewing devices (130).
  • a method for creating a global time-synchronized shutter mechanism where the shutter of every active camera on the system takes a picture at the exact same moment in time.
  • time-synchronized frames can be used to create a Time Shot.
  • the Real-Time Clock (RTC) in each camera is synchronized using the Network Time Protocol (NTP). This means that all the cameras have the same time on their RTC and all video frames carry timestamps using this synchronized time.
  • the video sensors in the cameras are triggered to capture images by the CPU (220).
  • the timing of these triggers is controlled by the RTC in the CPU (220) and by frame timing information from the AI-360 image processing system (400), delivered using the Video Mesh Network Protocol (180).
  • All video frame timestamps relate to the same moment in time on all operating cameras, even ones not on the same system. In this way, multiple cameras on multiple sites can all have their Time Shot images associated with each other. Systems operating on sites worldwide can do a Time Shot and freeze everything that was happening around the world at that moment in time.
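A minimal sketch of how a camera could schedule its shutter against an NTP-disciplined clock is shown below; because every camera derives the same trigger grid from absolute time, all shutters fire at the same instants. The 60 fps cadence and the function names are assumptions made for illustration.

```python
import time

FRAME_INTERVAL = 1.0 / 60.0   # assumed 60 fps capture cadence

def next_trigger_time(ntp_time, interval=FRAME_INTERVAL):
    """Return the next shutter instant on a grid anchored to absolute NTP time,
    so every camera computes the same trigger instants independently."""
    return (int(ntp_time / interval) + 1) * interval

def capture_loop(get_ntp_time, trigger_shutter, frames=3):
    """Fire the shutter at globally aligned instants and timestamp each frame."""
    for _ in range(frames):
        t = next_trigger_time(get_ntp_time())
        while get_ntp_time() < t:
            time.sleep(0.0002)           # wait until just before the trigger instant
        trigger_shutter(timestamp=t)     # every camera tags its frame with the same t

# Toy usage with the local clock standing in for the NTP-disciplined RTC.
capture_loop(time.time, lambda timestamp: print(f"frame @ {timestamp:.6f}"))
```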
  • the Video Mesh Network Protocol resides inside the data payload of standard Ethernet protocols, so VMNP packets can slip seamlessly into and out of standard Internet traffic; streams from different locations can be connected together, incorporated into local broadcasts, and participate in global Time Shot moments. Since all the video frames are timestamped with this NTP-based time, they can be organized and synchronized by moments in time.
  • the cameras buffer one-minute segments of video frames in the onboard RAM (222). Along with the buffered video frames is a structure that keeps track of the byte offset to the beginning of each frame. When one minute of video has accumulated, that minute of video is stored on the SD Card (224) and the frame offset information is written to an index file for that video buffer data file.
  • the GPS and target vector form part of the video data stream delivered using the Video Mesh Network Protocol data stream element ID 0x04.
  • a method for taking a native-resolution photograph from the action cameras is provided; the VMNP supports requesting video data from a camera, and part of that request is a stream ID which represents the camera lens number from 1 to n. A stream ID of 0 indicates that a photograph is requested, and the current video frame data is sent uncompressed to the AI-360 image processing workstation (400) for processing of the photograph data.
  • a method for separating the visible spectrum into red, green, and blue bands for processing as individual monochrome images.
  • a trichroic prism assembly is described that separates the full-color image into red, green and blue spectrum bands.
  • This geometry also has the unfortunate side effect of creating an asymmetrical device, which isn't a problem in larger cameras but becomes problematic in highly miniaturized cameras. Having to accommodate the elongated side causes an unnecessary increase in the camera housing size.
  • a symmetrical trichromatic beam splitting module (252) is presented.
  • Full-spectrum light leaving the lens enters the prism (252) and strikes the blue reflective dichroic filter (256) at a 45-degree angle and the image forming blue rays from 440 nm to 480 nm are directed 90 degrees from the original path.
  • the blue rays then strike the silvered mirror (258) at a 45-degree angle and are reflected 90 degrees toward the blue-ray receiving monochrome CMOS sensor (268).
  • the green wavelengths from 480 nm to 580 nm and the red wavelengths from 580 nm to 680 nm pass straight through the blue reflective dichroic filter (256) and strike the red reflective dichroic filter (260) at a 45-degree angle. The red image-forming rays are reflected 90 degrees from their original path and strike the silvered reflecting mirror (262) at 45 degrees, where they are reflected 90 degrees to fall on the red receiving monochrome CMOS sensor (264).
  • the green wavelengths pass through red reflecting dichroic filter (260) and travel straight to the green receiving monochrome CMOS sensor (266).
  • the optical paths followed by the red, green, and blue bands of the spectrum are all equal and cause no change in the arrival of the focused rays at the sensors.
  • the removal of the air gap permits the entire assembly to be made of one solid block which ensures precise alignment of the optical surfaces and strengthens the assembly.
  • the color separating prism uses thin-film optical coatings known as dichroic filters. This type of filter has very sharp cutoffs and can accurately control the bands of wavelengths that are passed.
  • the color filters used in the Bayer pattern sensor chips are by necessity simple plastic film filters with poor cutoff characteristics.
  • the overlap of the color bands causes a loss of purity in color. This is why professional cameras all use 3CMOS optical assemblies.
  • the monochrome sensors used in the 3CMOS module have the same number of pixels that exist in the Bayer pattern sensor. However, in the monochrome sensor, every available pixel captures useful detail information. The three separate monochrome images are then combined together in the FPGA (254) to create a full resolution RGB image.
  • the Bayer pattern sensor needs to capture all three colors with the same number of sensor pixels. To perform this, the sensor uses a pattern of red, green, and blue capture pixels, the number of which corresponds to the sensitivities of the sensor to those wavelengths, and then interpolates the colors for the locations where it doesn’t capture the two other colors. This reduces the detail of the produced images.
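For illustration, the sketch below shows the basic merge that the FPGA (254) performs in hardware: three full-resolution monochrome planes are stacked into one RGB frame, with no interpolation step of the kind a Bayer sensor requires. It is written in Python/NumPy purely to show the data flow, not the FPGA implementation.

```python
import numpy as np

def merge_3cmos(red_plane, green_plane, blue_plane):
    """Stack three full-resolution monochrome captures into one RGB frame.
    Every output pixel uses measured data for all three colors, so no
    interpolation step is needed, unlike Bayer demosaicing."""
    assert red_plane.shape == green_plane.shape == blue_plane.shape
    return np.stack([red_plane, green_plane, blue_plane], axis=-1)

# Toy 4x4 sensor planes standing in for the three monochrome CMOS outputs.
h, w = 4, 4
r = np.full((h, w), 200, dtype=np.uint8)
g = np.full((h, w), 120, dtype=np.uint8)
b = np.full((h, w), 40, dtype=np.uint8)
rgb = merge_3cmos(r, g, b)
print(rgb.shape)   # (4, 4, 3)
print(rgb[0, 0])   # [200 120  40]
```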
  • a Video Mesh Network (800), composed of multiple Video Mesh Nodes (100), transports and controls the video data coming from the various cameras used in the system.
  • One such possible configuration of the Video Mesh Network (800) is illustrated in FIG. 24.
  • the cameras (200) wirelessly communicate with the Video Mesh Network nodes (100) where the data is routed from node to node using the standard Open Shortest Path First (OSPF) protocol.
  • the frames travel through the network until exiting at either of the root nodes (100a) or (100b). From there the data travels through the Ethernet cables (108) to the router (120) and on to the video processing workstation (400) which generates the stream to be broadcast.
  • the output stream then travels back to router (120) along the Ethernet cable (108) and from there is sent out to any destination on the network or over the Internet to viewers.
  • the Video Mesh Network (800) is based on Wi-Fi 6 technology which provides the base platform for delivering multiple simultaneous streams of high-speed data. Unlike previous generations of Wi-Fi, this 6th generation was designed specifically to make possible high-capacity wireless networks capable of transferring large volumes of video data.
  • The theoretical maximum capacity of a Wi-Fi 6 network is around 10 Gbps, which is many times the capacity of previous generations of Wi-Fi. Despite this native capacity, some additional steps are needed to make this work with multiple cameras all sending high-bitrate data to a video processing server.
  • Wi-Fi 6 uses lower power for shorter hops from device to device.
  • One benefit of this is less cross-traffic interference between nodes in the network. With longer range transmissions, there are more opportunities for other cells to interfere with each other. In Wi-Fi 6, there is more opportunity to have multiple cells transmitting simultaneously.
  • Wi-Fi 6 makes use of multipath effects to transmit more than one signal at the same time.
  • a sounding frame is first transmitted and captured by the destination node. Analysis of the arriving radio information determines how the multiple paths are affecting this transmission and that information is used to create a data transmission that sends data for different clients over the multiple paths to the receiving station, thus increasing the amount of data that can be sent at once.
  • Wi-Fi 6 subdivides radio channels into smaller resource units that can be assigned to different clients, and multiple clients can send and receive simultaneously during the same transmission opportunity (TXOP) slice, further increasing the volume of data that can be moved through the network.
  • Wi-Fi 6 uses a mechanism known as Multi-User Orthogonal Frequency Division Multiple Access (MU-OFDMA) to subdivide the radio spectrum into smaller frequency allocations, called resource units (RUs), which permit the network node to synchronize communications (uplink and downlink) with multiple individual clients assigned to specific RUs.
  • This simultaneous transmission cuts down on excessive overhead at the medium access control (MAC) sublayer, as well as medium contention overhead.
  • the network node can allocate larger portions of the radio spectrum to a single user or partition the same spectrum to serve multiple clients simultaneously. This leads to better radio spectrum use and increases the efficiency of the network.
  • RUs alone aren't enough to manage the high volume of traffic in a video capture network, and the Wi-Fi 6 radio module can use some additional guidance on how to divide up the spectrum to transfer all the video data more efficiently. This is accomplished using the Video Mesh Network Protocol (VMNP) (180) and the Group element.
  • the VMNP group element lets the system dedicate portions of the radio spectrum to a data group that is used for a specific purpose. This locks a portion of the radio spectrum so that it is always used by this group and isn’t part of the general spectrum that the Wi-Fi 6 divides up on a TXOP basis to handle the random traffic that comes through the network.
  • the groups permit dedicating data pipes to specific tasks, such as a data backbone that carries the aggregated video traffic accumulating from multiple clients at various network nodes over a wider-bandwidth path that is always available.
  • the radio spectrum to be designated to the group is specified by turning on specific bits in the VMNP group element. Bits 0 - 36 control the radio spectrum allocation. Bits 37 - 53 control the RU allocation at the 20 MHz channel level as illustrated in FIG. 28; a sketch of packing such a bitmask follows.
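A small sketch of packing such a group allocation into a bitmask is shown below. The split of bits 0-36 (spectrum) and 37-53 (RU allocation) follows the description above, but the mapping of individual bits to channels and resource units, and the function name, are assumptions.

```python
def build_group_bitmask(spectrum_bits, ru_bits):
    """Pack a VMNP-style group allocation into a single integer bitmask.

    spectrum_bits : iterable of bit positions 0..36 (radio spectrum allocation)
    ru_bits       : iterable of positions 0..16, stored at offsets 37..53
                    (resource-unit allocation at the 20 MHz channel level)
    The meaning of each individual bit is an assumption made for this example.
    """
    mask = 0
    for b in spectrum_bits:
        if not 0 <= b <= 36:
            raise ValueError("spectrum bits must be in 0..36")
        mask |= 1 << b
    for b in ru_bits:
        if not 0 <= b <= 16:
            raise ValueError("RU bits must be in 0..16")
        mask |= 1 << (37 + b)
    return mask

mask = build_group_bitmask(spectrum_bits=[0, 1, 2], ru_bits=[0, 5])
print(f"{mask:054b}")   # 54-bit view: RU bits 37 and 42 set, spectrum bits 0-2 set
```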
  • the division of radio spectrum, the subdivision of spectrum into resource units, and the control of spectrum and RUs into groups for data pipe creation provides the needed control to improve the efficiency of the data transmission through the network.
  • when video frames are encoded for transmission, they generate a series of compressed frames of different types and sizes. There are Intra Frames, which occur at regular intervals such as once every 60 frames for video produced at 60 fps, and there are Inter Frames between these Intra Frames, which contain highly compressed data that only encodes what has changed from previous frames.
  • the Intra Frames don’t utilize any predictive compression and as such are considerably larger in size than the Inter Frames.
  • each camera would start its I-frames offset by 6 frames from the other cameras.
  • This video interleaving is part of the VMNP and prevents too many clients from trying to send large payloads over the network at random and in competition with each other.
  • the smaller Inter Frames are much easier to manage once the larger I-frames have been organized into a predictable traffic pattern on the network.
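The sketch below shows one way such interleaving could be computed: each camera receives a staggered offset inside the GOP so its large Intra Frames never coincide with those of its neighbors. The 60-frame GOP and 6-frame spacing follow the example figures above; the function names are illustrative assumptions.

```python
def iframe_offsets(num_cameras, gop_length=60, spacing=6):
    """Assign each camera a frame offset within the GOP so that their large
    Intra Frames are staggered instead of colliding on the network.

    With a 60-frame GOP and 6-frame spacing, up to 10 cameras each get a
    distinct I-frame slot; extra cameras wrap around and share slots.
    """
    slots = gop_length // spacing
    return {cam: (cam % slots) * spacing for cam in range(num_cameras)}

def is_iframe(camera_id, frame_number, offsets, gop_length=60):
    """True when this camera should emit an Intra Frame for this frame number."""
    return (frame_number - offsets[camera_id]) % gop_length == 0

offsets = iframe_offsets(num_cameras=4)
print(offsets)                                           # {0: 0, 1: 6, 2: 12, 3: 18}
print([is_iframe(1, n, offsets) for n in (6, 7, 66)])    # [True, False, True]
```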
  • Wi-Fi 6 has another speed enhancement feature.
  • This is Multi-User Multi-Input Multi-Output (MU-MIMO), which is the ability to send multiple data streams on multiple antennas, on the same frequency and at the same time. This is effective for large data pipes and sending large blocks of data from one network node to another.
  • the Video Mesh Network uses OFDMA to handle data transfer from all the clients to the Video Mesh Network Nodes, and then MIMO for transmission of large blocks of data between the network nodes as it is routed through the network.
  • the Wi-Fi 6 radios make use of beamforming when using multiple antennas with MIMO. This is where the signals to the antennas are phase-controlled in such a way that the output of the antenna array can be focused in a particular direction toward a target receiver. This increases the range of the transmission and minimizes the interference with other radio units that are talking on other portions of the network at the same time.
  • the Video Mesh Network Node (100) has three antenna groups. There is a 5.8 GHz eight-antenna array (172), which permits eight different beamformed transmissions at the same time on the same frequency (referred to as 8x8x8 MIMO); a 2.4 GHz four-antenna array (174), which permits four different beamformed transmissions at the same time on the same frequency (referred to as 4x4x4 MIMO); and a GPS receiver antenna (168).
  • the Video Mesh Network Node (100) also has an Ethernet port (176) which permits the transfer of data to and from a wired Ethernet connection.
  • the network traffic can be shifted to a standard Ethernet network as needed, either to get around obstructions to the Wi-Fi transmissions or to shift the traffic to a wider area physical network.
  • the physical network traffic enters or leaves on Ethernet module (176) and is transferred by CPU (162) to RAM (164) and the Wi-Fi 6 module (170) sends and receives wireless network traffic in exchange with the CPU (162).
  • the CPU (162) handles all the Wi-Fi 6 protocols as well as the OSPF routing protocol. It also processes requests from the Video Mesh Network Protocol (180) for statistics, data group control, and routing preferences for groups.
  • the Wi-Fi 6 radios adjust their power to minimize interference and maximize throughput.
  • Local clients talking to a network node use OFDMA and low power to maximize the number of simultaneous users on the node; the nodes then switch to MIMO and beamforming at higher power to send the data from network node to network node at maximum data transfer rates.
  • the combination of power control, spectrum subdivision, and simultaneous directional data transfers on the same frequencies enable the network to transfer significant amounts of information at the same time all over the wireless mesh network.
  • the organization provided by the interleaved video transmission and organization of the video transmission paths using the Video Mesh Network Protocol (VMNP) (180) optimizes the network for maximum capacity and efficiency.
  • the CPU (162) communicates with GPS (168) for both location information and for time synchronization.
  • the Video Mesh Network Nodes (100) act as a clock source for the Network Time Protocol (NTP) which is used to synchronize the RTC and shutters in cameras that don’t have their own GPS clock source.
  • the Video Mesh Network Protocol (VMNP) (180) is used to manage the data flowing through the Video Mesh Network (800).
  • the VMNP follows the standard layout of an OSI layer and consists of a Protocol Data Unit (PDU) and a Service Data Unit (SDU).
  • the VMNP PDU encapsulates the SDU, which consists of a variable-length payload of 42 to 1946 octets.
  • This payload data is filled with one or more VMNP elements that identify different types of actions and their associated data formats.
  • the last element in the payload area is always VMNP element ID 0x00, or the END element.
  • the VMNP (180) fits into the standard OSI networking model in the following manner.
  • the Physical layer PDU payload encapsulates the MAC layer PDU.
  • the MAC SDU payload encapsulates the VMNP PDU.
  • the VMNP SDU encapsulates the various VMNP elements in the SDU payload.
  • the VMNP (180) controls all the features of and cameras connected to the Video Mesh Network as well as configuring the resources of the Video Mesh Network Nodes (100).
  • the VMNP (180) consists of twelve unique elements each with a corresponding one octet ID field.
  • the various VMNP (180) elements perform the following functions.
  • 0x00 (END) signifies the end of a collection of VMNP (180) elements and stops any further parsing of the payload data for additional elements.
  • 0x01 (Camera Config) sets the specified camera configuration parameters.
  • 0x02 (Camera Mode) sets the current mode of operation of the camera.
  • Modes include:
  • 0x03 (Video Config) sets the video operation, including video encoder settings, resolutions, and bit rates.
  • 0x04 (Data Stream) carries a block of data from a video or photograph.
  • 0x05 (Camera Config) returns the current camera configuration.
  • 0x06 (Camera Mode) returns the current camera mode of operation.
  • 0x07 (Video Config) returns the current video encoder configuration.
  • 0x08 (Assign Group) Assigns this camera to one of the data pipe groups.
  • Group Config configures the network node to use specific spectrum and resource units for all traffic assigned to this group ID.
  • 0x0A requests statistics from a camera.
  • 0x0B (Statistics) requests statistics from a network node.
  • 0x0C (Request Data) requests data from a camera.
  • Stream ID 0 indicates to take a photograph in native resolution and return the data.
  • a stream ID equal to a lens number plus a time range retrieves video for that lens from that time range.
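The sketch below walks a VMNP payload element by element until the END element (0x00) is reached. The one-octet element IDs and the terminating END element follow the description above, but the exact wire layout (a one-octet length following the ID) is an assumption made so the example can run.

```python
VMNP_END = 0x00
VMNP_DATA_STREAM = 0x04
VMNP_REQUEST_DATA = 0x0C

def parse_vmnp_payload(payload: bytes):
    """Walk the VMNP SDU payload and return (element_id, value) pairs.

    Assumed wire layout for illustration: one-octet element ID, one-octet
    length, then 'length' octets of element data; parsing stops at END (0x00).
    """
    elements, pos = [], 0
    while pos < len(payload):
        element_id = payload[pos]
        if element_id == VMNP_END:
            break                      # END element: stop parsing further elements
        length = payload[pos + 1]
        value = payload[pos + 2 : pos + 2 + length]
        elements.append((element_id, value))
        pos += 2 + length
    return elements

# A Request Data element asking camera lens/stream 2 for video, followed by END.
payload = bytes([VMNP_REQUEST_DATA, 0x01, 0x02, VMNP_END])
print(parse_vmnp_payload(payload))    # [(12, b'\x02')]
```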
  • the individual camera resolutions and bitrates are adjustable via the VMNP for various image sizes and quality using the Video Config element ID 0x03.
  • Each camera has a high-resolution code, field HR CODE for live video and a low-resolution code, field LR CODE for preview video.
  • Table 1 lists the various resolution codes and their associated resolutions, bitrates and encoding levels. The highest possible image quality is produced with settings for what is known as transparent encoded video. This is a video with little or no compression artifacts and produces large amounts of data which is customarily used for a master recording that is suitable for editing without degrading the video quality.
  • the VMNP (180) provides a compact structure to embed a variety of data into standard Ethernet protocols for easy transport over any network, and the interleaved video and Wi-Fi 6 spectrum controls provide the needed organizational additions to make large-scale video capture in a distributed wireless environment possible.
  • the images are processed using the AI-360 image processing workstation (400).
  • the AI-360 processor uses a Convolutional Neural Network (CNN) based approach to standard image processing techniques.
  • These CNNs are a type of artificial neural network that is used for image recognition and processing and employ deep learning to create mappings of input images to output images that have some form of image processing performed on them.
  • the output of the CNN is a homography matrix that transforms the images to their correct location and shape in the output mosaic of images.
  • the CNN maps a distorted input image to an undistorted output image.
  • CNNs are used to align an image with the horizon, turn a low-resolution image into a higher resolution image and correct color and exposure among other treatments.
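As a simple illustration of what the homography output is used for, the sketch below applies a 3x3 homography matrix to map source-image coordinates into their location in the output mosaic. This is a generic perspective transform in NumPy, not the AI-360 workstation's internal code.

```python
import numpy as np

def apply_homography(H, points):
    """Map 2D image points into the output mosaic using a 3x3 homography H.

    points : (N, 2) array of (x, y) pixel coordinates in the source frame.
    Returns the (N, 2) transformed coordinates after perspective division.
    """
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])   # (x, y, 1)
    mapped = homogeneous @ np.asarray(H, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]                    # perspective divide

# A homography that shifts a frame 100 px right and 20 px down in the mosaic.
H = np.array([[1.0, 0.0, 100.0],
              [0.0, 1.0,  20.0],
              [0.0, 0.0,   1.0]])
print(apply_homography(H, [[0, 0], [640, 360]]))
# [[100.  20.]
#  [740. 380.]]
```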
  • the CPU (401) handles exchanging images with the CNNs (600) as well as directing the entire chain of processing of the inbound data streams into processed video output streams.
  • the CPU (401) receives video frame data packets from all the cameras via Ethernet module (405). This data is organized and sorted back into the individual data streams for processing.
  • the encoded frames are decoded by the video decoder (406) and stored in RAM (403). Once all the frames for the current video slice are ready, they are sent into the image processor (600).
  • the CNNs can be arranged in many different ways and have different processing abilities. Referring now to FIG. 38, one possible workflow arrangement is as follows: the current frames are input into the image processing engine (901), and a determination is made whether this is the first frame of a new video sequence or a continuation of an existing sequence (902).
  • this next CNN takes the distorted images and corrects them to minimize these distortions and produce a higher quality image (909).
  • the next CNN corrects imbalances of color or exposure in different areas of the virtual scene to make a more unified composited image (910).
  • with a large virtual image to work with, it is determined whether there are any objects being tracked (912). If there are, they are separated into the foreground (tracked objects) and background (the rest of the image) to facilitate tracking (913). The objects are then tracked through multiple frames of video (914).
  • the targeting information for any tracked objects is available along with any manual target points selected by the operator, and the CNN now composes final image frames out of the larger virtual scene using standard image composition rules (915).
  • the output frames are read by the CPU (401) where they are sent to the video card (404) for display on the workstation screen.
  • the CPU also sends the frames to the video encoder (407) where they are compressed and returned to the CPU for distribution or forwarded to TV broadcasting equipment via the 3G SDI interface (408).
  • the ability of the AI-360 image processor to generate virtual camera views and track objects creates the ability to produce Video Threads. These are sequences of video created by putting together views from multiple cameras as they track a subject moving around them. A group of hockey players wearing camera (214) would produce a collection of 360-degree videos of a scene from many different camera angles.
  • the AI-360 image processor can track a subject of interest, a hockey player, as that player moves between the other players on the ice.
  • the 360-degree camera feeds from the players permit the AI-360 image processor to create virtual views, following image composition guidelines, and build video sequences that follow the player's movements as though there were a camera on the ice following the action. This process can be directed by an operator to produce a replay highlight video segment.
  • All the CNNs need to be trained to perform their functions. This is referred to as deep learning and essentially consists of presenting the neural networks with input images and reference output images.
  • the CNN systematically applies various solution attempts to try to turn the input image into the output image. Over time, a collection of solutions emerges for various input conditions and the images can be processed quite rapidly. What can take multiple seconds to process using algorithms and CPU time, a CNN can accomplish in milliseconds.
  • Some image processing tasks can use simpler training setups where a less-than-optimal input image is compared to an optimal output image, and the data set created for the training is straightforward to produce.
  • One such challenge is a surround camera system where a circle of cameras captures a target area from multiple angles, and the AI-360 image processor (600) has to create a seamless stitched view from any angle.
  • In FIG. 33, a training mechanism for a surround camera system that is suitable for generating the large training data sets needed is illustrated.
  • Platform (950) has a pair of manikin targets, (951) and (952), posed on it.
  • the platform (950) rotates at one revolution per minute.
  • There are two fixed cameras (960) and (970) which are located at the normal positions of cameras at one evenly spaced segment of the circle surrounding the subject area.
  • a reference camera (962) is moved in one-degree steps from position (960) to (970), with each step occurring upon the completion of one revolution of the target platform.
  • the fixed cameras and the reference camera are capturing video at 60 fps while the platform (950) is rotating and the reference camera (962) is stepping from fixed position (960) to fixed position (970).
  • the sweep captures all the relative one-degree positions between the two fixed camera locations, and since the camera locations are evenly spaced around the circle all such segments have the same relationship to each other. This way, the reference camera (962) only needs to capture the relative positions for one slice of the circle.
  • the one sweep of camera (962) will capture all possible angles of the targets from all possible positions. This produces a large sample data set of reference images to train the stitching attempts from various virtual camera positions between the start and endpoints.
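A small sketch of how that sweep could be indexed into training samples is shown below: with the platform at one revolution per minute and capture at 60 fps, each one-degree position of the reference camera yields one full revolution (3600 frames) of simultaneously captured fixed-camera and reference frames. The angular span of the segment is an assumption made for the example.

```python
FPS = 60
SECONDS_PER_REVOLUTION = 60                        # platform turns at one revolution per minute
FRAMES_PER_ANGLE = FPS * SECONDS_PER_REVOLUTION    # 3600 frames captured per 1-degree step

def training_samples(num_degrees=90):
    """Yield (degree, frame_index) keys that pair each reference-camera frame
    with the simultaneously captured fixed-camera frames.

    The reference camera holds each 1-degree position for one full platform
    revolution, so every (degree, frame) pair indexes one training triple:
    (fixed camera A frame, fixed camera B frame, reference frame at 'degree').
    The 90-degree default span is an assumption for illustration only.
    """
    for degree in range(num_degrees + 1):
        for frame in range(FRAMES_PER_ANGLE):
            yield degree, frame

samples = training_samples(num_degrees=2)
print(sum(1 for _ in samples))   # 3 positions x 3600 frames = 10800 training keys
```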
  • a typical failing of camera systems that look to stitch images together is the parallax error created between the two cameras with objects that are close to the cameras. Distant areas of the scene stitch together with little distortion, but objects that are close to the cameras produce widely differing images that are problematic to stitch together.
  • Some camera systems just try to blend the images together at the seams, which produces ghosts and other visual artifacts in the image. Others will select one image or the other and use all the content from the selected image, producing a resulting image with varying degrees of success at stitching the images.
  • the AI-360 solution uses CNNs, which are quite adept at identifying which pixels are closer to the camera and which are farther away. This is accomplished by analyzing the relative pixel motion between the two images. Armed with the knowledge of which pixels are closer to the camera, and a reference image of what the scene actually looks like from the desired virtual position, the CNN can learn how to incorporate the portions of each image to produce a stitched image of how the scene would look at that virtual location.
  • a variation of this problem is presented by the surround system utilized in the MMA fighting ring system.
  • using a mechanism similar to the one used in FIG. 33, and moving the reference camera (990) between the fixed-position cameras (980) and (988), the reference camera (990) captures reference images from all the positions, which include both parallel cameras and cameras that are aligned at angles with each other.
  • This data set trains the CNN on how to handle creating a virtual image from various positions around the ring and handling the mixture of aligned and angled cameras.
  • the AI-360 image processor can also stitch together images from cameras that have no fixed relationships or constraints on their orientation.
  • FIG. 37 depicts a number of sample images taken by cameras with different combinations of alignment relative to each other, and a reference image of how the scene appears from a real camera at that position.
  • a data set is produced to train the CNN to generate stitching solutions to a wider range of relative camera positions and angles.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)

Abstract

The invention relates to a system and method for creating live-action camera video capture, control, a video network and broadcast production, preferably comprising at least one live-action camera for capturing video, a Wi-Fi 6 based wireless network, and an artificial-intelligence neural-network image processing system for preparing the video for broadcast.
PCT/CA2021/050100 2020-01-29 2021-01-29 Caméra d'action en direct, commande, capture, routage, traitement et système et procédé de diffusion WO2021151205A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202062967180P 2020-01-29 2020-01-29
US62/967,180 2020-01-29

Publications (1)

Publication Number Publication Date
WO2021151205A1 true WO2021151205A1 (fr) 2021-08-05

Family

ID=77078573

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2021/050100 WO2021151205A1 (fr) 2020-01-29 2021-01-29 Caméra d'action en direct, commande, capture, routage, traitement et système et procédé de diffusion

Country Status (1)

Country Link
WO (1) WO2021151205A1 (fr)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018213481A1 (fr) * 2017-05-16 2018-11-22 Sportscastr.Live Llc Systèmes, appareil et procédés de visualisation évolutive à faible latence de flux de diffusion intégrés de commentaire et de vidéo d'événement pour des événements en direct, et synchronisation d'informations d'événements avec des flux visualisés par l'intermédiaire de multiples canaux internet
WO2019128592A1 (fr) * 2017-12-29 2019-07-04 广州酷狗计算机科技有限公司 Procédé et appareil de diffusion en direct
CN108260023A (zh) * 2018-04-20 2018-07-06 广州酷狗计算机科技有限公司 进行直播的方法和装置
CN109660817A (zh) * 2018-12-28 2019-04-19 广州华多网络科技有限公司 视频直播方法、装置及系统
CN112073739A (zh) * 2020-08-18 2020-12-11 深圳锐取信息技术股份有限公司 一种移动录播控制系统及方法

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113949694A (zh) * 2021-10-15 2022-01-18 保升(中国)科技实业有限公司 一种基于视频ai计算和大数据分析底层生态环境系统

Similar Documents

Publication Publication Date Title
US10484652B2 (en) Smart headgear
CN108495048B (zh) 基于云台控制的双摄像头图像采集设备
CN107948577A (zh) 一种全景视讯会议的方法及其系统
KR20100073079A (ko) 동기화된 다중 영상 취득을 위한 다중 카메라 제어 및 영상저장 장치 및 방법
US20080024594A1 (en) Panoramic image-based virtual reality/telepresence audio-visual system and method
US20040013192A1 (en) Mobile live information system
CN105264876A (zh) 低成本电视制作的方法及系统
CN108650494B (zh) 基于语音控制的可即时获取高清照片的直播系统
JP5942258B2 (ja) 映像表示システム、映像合成再符号化装置、映像表示装置、映像表示方法、及び映像合成再符号化プログラム
CN112601033B (zh) 云转播系统及方法
CN205510277U (zh) 一种无人机全景图像传输设备
US20130111051A1 (en) Dynamic Encoding of Multiple Video Image Streams to a Single Video Stream Based on User Input
KR101446995B1 (ko) 멀티앵글영상촬영헬멧 및 촬영방법
CN108200394A (zh) 一种支持多路图像传输的无人机系统
CN108650522A (zh) 基于自动控制的可即时获取高清照片的直播系统
WO2021151205A1 (fr) Caméra d'action en direct, commande, capture, routage, traitement et système et procédé de diffusion
CN105812640A (zh) 球型全景摄像装置及其视频图像传输方法
CN108696724A (zh) 可即时获取高清照片的直播系统
RU2758501C1 (ru) Система, устройство и способ трансляции и приема контента в реальном времени с носимых устройств с управляемой задержкой и поддержанием качества контента
JP4250814B2 (ja) 3次元映像の送受信システム及びその送受信方法
KR20190032670A (ko) 다시점 카메라를 이용한 실감 영상 서비스 제공 시스템
Qvarfordt et al. High quality mobile XR: Requirements and feasibility
Mademlis et al. Communications for autonomous unmanned aerial vehicle fleets in outdoor cinematography applications
Domański et al. Experiments on acquisition and processing of video for free-viewpoint television
US20200234500A1 (en) Method, device and computer program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21747548

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23-11-2022)

122 Ep: pct application non-entry in european phase

Ref document number: 21747548

Country of ref document: EP

Kind code of ref document: A1