WO2018222532A1 - Tracking of objects of interest with a zoom lens based on radio-frequency identification (RFID) - Google Patents

Tracking of objects of interest with a zoom lens based on radio-frequency identification (RFID)

Info

Publication number
WO2018222532A1
Authority
WO
WIPO (PCT)
Prior art keywords
camera
selected object
tracked
video
wirelessly
Prior art date
Application number
PCT/US2018/034653
Other languages
English (en)
Inventor
Kumar Ramaswamy
Jeffrey Allen Cooper
Original Assignee
Vid Scale, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vid Scale, Inc.
Publication of WO2018222532A1

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/235Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on user input or interaction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/62Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/10Recognition assisted with metadata

Definitions

  • images are captured by lenses with different optical parameters, such as different optical zoom factors. Different camera optics affect the quality and appearance of the images captured by the cameras. These images may undergo further digital enhancement, such as digitally changing the zoom and field of view (FOV) parameters.
  • Some video cameras are equipped with multiple lens configurations.
  • the video camera may be realized as a stand-alone video camera, a smartphone, a tablet device, or the like.
  • the multiple lens configurations can provide for both a normal and a zoom-capture capability.
  • an iPhone 7+ mobile phone is equipped with two lenses, a 28 mm equivalent lens with an f/1.8 optical setting and a 56 mm equivalent lens with an f/2.8 optical setting. These two lenses allow for a 2x optical zoom.
  • Some video cameras include a swivel-camera capability.
  • a video camera is motorized to move.
  • the swivel-equipped video camera may include a second video lens that has a greater optical zoom than a first video lens on the same video camera.
  • Radio-frequency identification (RFID) tags are increasingly being used to track the location of different objects of interest. For example, athletes and balls may be tracked with RFID tags in the context of a professional sporting event.
  • an RFID tag and its accompanying RFID location processing system provide for precise location tracking referenced to a standard set of coordinate systems.
  • the tracking data also provides a continuous update of the object's location, speed of movement, identity and the like.
  • a video tracking method is performed that allows for tracking of one or more objects in video even if the objects are occluded or otherwise unavailable for optical tracking methods.
  • video of a scene is captured with a camera, e.g. a camera on a wireless handheld user device such as a smartphone.
  • a selected object in the captured video is optically tracked to determine an optically-tracked location within the captured video.
  • the optically-tracked location within the video may be expressed, for example, as pixel coordinates within the video.
  • the user device determines a position and orientation of the camera.
  • the user device also wirelessly receives coordinates that indicate the position of the selected object.
  • These coordinates may have been generated by, for example, tracking an RFID tag on the selected object. Based on the position and orientation of the camera, the received coordinates of the selected object are mapped to a mapped location in the captured video, which may be represented by pixel coordinates. In response to a determination that the selected object is obscured in the captured video, the mapped location is used to track the selected object.
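As an illustration of the mapping step described above, the sketch below projects a wirelessly-received world coordinate into pixel coordinates with a standard pinhole camera model and falls back to that mapped location when the object is obscured. This is a minimal sketch, not part of the published application; the pose representation (rotation matrix plus camera position), the intrinsic matrix, and the function names are assumptions.

```python
import numpy as np

def project_to_pixels(world_point, camera_position, camera_rotation, intrinsics):
    """Map a 3D world coordinate (e.g. from RFID tracking) to pixel coordinates.

    world_point     : (3,) point in the scene/RFID coordinate system
    camera_position : (3,) camera position in the same coordinate system
    camera_rotation : (3, 3) rotation matrix from world axes to camera axes
    intrinsics      : (3, 3) pinhole intrinsic matrix K
    Returns (u, v) pixel coordinates, or None if the point is behind the camera.
    """
    p_cam = camera_rotation @ (np.asarray(world_point, float) - np.asarray(camera_position, float))
    if p_cam[2] <= 0:  # behind the image plane
        return None
    uvw = intrinsics @ p_cam
    return uvw[0] / uvw[2], uvw[1] / uvw[2]

def tracked_location(optical_loc, mapped_loc, object_obscured):
    """Use the optically-tracked location while it is available; otherwise fall
    back to the location mapped from the wirelessly-received coordinates."""
    return mapped_loc if object_obscured or optical_loc is None else optical_loc
```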
  • Some embodiments operate to identify the selected object that is being optically tracked.
  • the user device receives, for each of a plurality of wirelessly-tracked objects, respective coordinates indicating a position of the wirelessly-tracked object. Based on the position and orientation of the camera, the user device maps the received coordinates of the wirelessly-tracked objects to respective mapped locations in the captured video. The user device compares the optically-tracked location of the selected object to the mapped locations of the wirelessly-tracked objects to determine which wirelessly-tracked object corresponds to the selected object. In situations in which coordinates are received for multiple objects, such an identification method may be used to determine which coordinates should be used for tracking in case the particular selected object becomes obscured.
  • the identity of the selected object may be determined as follows. For each of a plurality of wirelessly-tracked objects, the user device receives (i) respective coordinates indicating a position of the wirelessly-tracked object and (ii) a respective identifier of the wirelessly-tracked object. Based on the position and orientation of the camera, the device maps the received coordinates of the wirelessly-tracked objects to respective mapped locations in the captured video. The device compares the optically-tracked location of the selected object to the mapped locations of the wirelessly-tracked objects to determine which wirelessly-tracked object corresponds to the selected object. The device then associates the identifier of the determined wirelessly-tracked object with the selected object. In some embodiments, the identifier associated with the selected object is displayed together with the captured video on the user device. The identifier may be displayed at or near the respective mapped location in the captured video.
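One way to realize the comparison described above is a nearest-neighbour test in pixel space between the optically-tracked location and the mapped locations of the wirelessly-tracked objects. The sketch below assumes the mapped locations have already been computed; the distance threshold is an illustrative value, not one specified in the application.

```python
import math

def identify_selected_object(optical_loc, mapped_objects, max_pixel_distance=50.0):
    """Associate the optically-tracked location with one wirelessly-tracked object.

    optical_loc    : (u, v) pixel location from optical tracking
    mapped_objects : dict of identifier -> (u, v) mapped pixel location
    Returns the identifier of the closest mapped object, or None if nothing is
    within max_pixel_distance pixels.
    """
    best_id, best_dist = None, float("inf")
    for obj_id, (u, v) in mapped_objects.items():
        dist = math.hypot(u - optical_loc[0], v - optical_loc[1])
        if dist < best_dist:
            best_id, best_dist = obj_id, dist
    return best_id if best_dist <= max_pixel_distance else None
```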
  • the user device has at least two cameras: a first camera used for the optical tracking and a second camera on a swivel mount.
  • a method as described above may include operating the swivel mount to direct the second camera in line with the optically-tracked location, e.g. such that the field of view of the second camera is substantially centered on the optically-tracked location.
  • the swivel mount may be operated to direct the second camera in line with the mapped location, e.g. such that the field of view of the second camera is substantially centered on the mapped location.
  • the wirelessly-received coordinates may be coordinates generated from RFID-based position information or from GPS-based position information, among other alternatives.
  • the determination of the position and orientation of the camera is performed by a method that includes calibrating the position and orientation against predetermined locations in the scene that are captured by the camera.
  • the position and orientation of the camera may be updated using motion sensors on the user device.
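A calibration of this kind can be implemented as a perspective-n-point solve against the predetermined scene locations, with gyroscope readings used to propagate the orientation between calibrations. The sketch below uses OpenCV's solvePnP as one possible implementation; the choice of library, the simple gyroscope propagation, and the function names are assumptions rather than details from the application.

```python
import cv2
import numpy as np

def calibrate_camera_pose(reference_points_3d, observed_pixels, camera_matrix):
    """Estimate camera position and orientation from predetermined scene locations.

    reference_points_3d : (N, 3) known coordinates of landmarks in the tracking area
    observed_pixels     : (N, 2) pixel coordinates of those landmarks in the image
    camera_matrix       : (3, 3) intrinsic matrix of the camera
    Returns (rotation_matrix, camera_position) in the scene coordinate system.
    """
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(reference_points_3d, dtype=np.float64),
        np.asarray(observed_pixels, dtype=np.float64),
        camera_matrix, None)
    if not ok:
        raise RuntimeError("pose calibration failed")
    rotation, _ = cv2.Rodrigues(rvec)          # world -> camera rotation
    position = -rotation.T @ tvec.reshape(3)   # camera centre in world coordinates
    return rotation, position

def update_orientation_from_gyro(rotation, gyro_rates, dt):
    """Propagate the orientation between calibrations using motion-sensor data.

    gyro_rates : (x, y, z) angular rates in rad/s from the device's IMU
    dt         : time since the last update in seconds
    """
    delta_rot, _ = cv2.Rodrigues(np.asarray(gyro_rates, dtype=np.float64).reshape(3, 1) * dt)
    return delta_rot @ rotation
```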
  • the user device displays at least a portion of the captured video and also displays an indicator of the selected object at the optically-tracked location.
  • the user device displays the indicator of the selected object at the mapped location.
  • the indicator may be, for example, an identifier of the selected object and/or a rectangle or other shape around the selected object.
  • Some embodiments operate to provide a zoomed view of the selected object by cropping and zooming a portion of the captured video that includes the optically-tracked location.
  • the device switches to cropping and zooming a portion of the captured video that includes the mapped location.
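The crop-and-zoom behaviour described in the two items above can be sketched as selecting a window centred on the tracked location (or on the mapped location when the object is obscured) and resizing it to the display resolution. The window and output sizes below, and the use of OpenCV's resize, are illustrative assumptions.

```python
import cv2

def crop_and_zoom(frame, center, crop_size=(480, 270), output_size=(1920, 1080)):
    """Return a zoomed view of the frame around the tracked (or mapped) location.

    frame       : H x W x 3 video frame; the crop window must fit inside it
    center      : (u, v) pixel location of the selected object
    crop_size   : (width, height) of the window to extract
    output_size : (width, height) of the upscaled output
    """
    h, w = frame.shape[:2]
    cw, ch = crop_size
    # Clamp the crop window so it stays inside the frame.
    x0 = int(min(max(center[0] - cw / 2, 0), w - cw))
    y0 = int(min(max(center[1] - ch / 2, 0), h - ch))
    crop = frame[y0:y0 + ch, x0:x0 + cw]
    return cv2.resize(crop, output_size, interpolation=cv2.INTER_LINEAR)
```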
  • devices that include a camera, a wireless interface, a processor, and a non-transitory computer-readable storage medium storing instructions operative to perform any of the functions described herein.
  • a video recording device comprises two different lenses, one with a wide-field of view and a second one with a narrow-field of view. The position of an object being captured by the cameras is determined with an RFID-based locating system.
  • video with the wide-field of view camera can be displayed with mapped metadata corresponding to the tracked objects being overlaid on the video.
  • a user may select a zoomed-in view of the video around one of the tracked objects.
  • Video from the narrow field of view camera having the selected tracked object is then displayed.
  • One embodiment takes the form of a method comprising: imaging a scene with a first camera having a wide field of view; tracking a location of an object in the image from the first camera; imaging the scene with a second camera having a narrow field of view; responsive to determining that the object is within the view of the second camera, outputting the video obtained from the second camera; and responsive to determining that the object is not within the view of the second camera, cropping the images of the first camera to contain the object, upscaling the cropped video, and outputting the cropped and up-scaled video.
  • One embodiment takes the form of a method comprising: capturing a first video of a scene with a wide field of view (W-FOV) camera; receiving position information and metadata information for a tracked object within the scene; mapping the position information to the tracked object within the scene; displaying the metadata information on a display of the first video based on the mapped position; receiving a video-tracking request for the tracked object; capturing a second video of a portion of the scene with a narrow field of view (N-FOV) camera having the tracked object; and displaying the second video.
  • Another embodiment takes the form of a method comprising: capturing a first image of a scene with a first camera having a wide field of view; receiving object position information for a tracked object; determining the position and orientation of the first camera in relation to a coordinate system used for the object position information; mapping the object position information into the coordinates of the first camera; detecting the tracked object in the first image; fusing information from the object position with the detected tracked objects; identifying the tracked objects in the first image from the first camera using the object position information; overlaying a description of the tracked object on the first image of the first camera provided to a display; and responsive to receiving an indication of a selection of the tracked object: capturing a second image of the scene from a second camera having a narrow field of view; stabilizing the video of the first and second cameras; extracting images from both the first camera's video and from the second camera's video; selecting a source for a zoom view between the first and second cameras; and displaying the image from the selected source.
  • FIG. 1A is a functional block diagram of a wireless transmit/receive unit that may be used in some embodiments.
  • FIG. 1 B is a functional block diagram of a network entity that may be used in some embodiments.
  • FIG. 2 depicts a system architecture for use of RFID information with a video camera system, in accordance with an embodiment.
  • FIG. 3 is an example sequence diagram for RFID-based tagging and tracking of objects of interest in camera views, in accordance with an embodiment.
  • FIG. 4 depicts a video capture device displaying RFID information overlaid on a video, in accordance with an embodiment.
  • FIGs. 5A-C depict a series of video frames showing an optical occlusion, in accordance with an embodiment.
  • FIG. 6 is a flow chart of an object tracking method in accordance with an embodiment.
  • FIG. 7 is a flow chart of an object tracking method in accordance with another embodiment.
  • modules that carry out (i.e., perform, execute, and the like) various functions that are described herein in connection with the respective modules.
  • a module includes hardware (e.g., one or more processors, one or more microprocessors, one or more microcontrollers, one or more microchips, one or more application-specific integrated circuits (ASICs), one or more field programmable gate arrays (FPGAs), one or more memory devices) deemed suitable by those of skill in the relevant art for a given implementation.
  • Each described module may also include instructions executable for carrying out the one or more functions described as being carried out by the respective module, and it is noted that those instructions may take the form of or include hardware (i.e., hardwired) instructions, firmware instructions, software instructions, and/or the like, and may be stored in any suitable non-transitory computer-readable medium or media, such as commonly referred to as RAM, ROM, etc.
  • Some embodiments disclosed herein are implemented using one or more wired and/or wireless network nodes, such as a wireless transmit/receive unit (WTRU), an RFID tag, a video camera, or other network entity.
  • FIG. 1A is a system diagram of an exemplary WTRU 102, which may be employed as a client device, video capture device, or other components in embodiments described herein.
  • the WTRU 102 may include a processor 118, a communication interface 119 including a transceiver 120, a transmit/receive element 122, a speaker/microphone 124, a keypad 126, a display/touchpad 128, a nonremovable memory 130, a removable memory 132, a power source 134, a global positioning system (GPS) chipset 136, and sensors 138.
  • the processor 118 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGAs) circuits, any other type of integrated circuit (IC), a state machine, and the like.
  • the processor 118 may perform signal coding, data processing, power control, input/output processing, and/or any other functionality that enables the WTRU 102 to operate in a wireless environment.
  • the processor 118 may be coupled to the transceiver 120, which may be coupled to the transmit/receive element 122. While FIG. 1A depicts the processor 118 and the transceiver 120 as separate components, it will be appreciated that the processor 118 and the transceiver 120 may be integrated together in an electronic package or chip.
  • the transmit/receive element 122 may be configured to transmit signals to, or receive signals from, a base station over the air interface 116.
  • the transmit/receive element 122 may be an antenna configured to transmit and/or receive RF signals.
  • the transmit/receive element 122 may be an emitter/detector configured to transmit and/or receive IR, UV, or visible light signals, as examples.
  • the transmit/receive element 122 may be configured to transmit and receive both RF and light signals. It will be appreciated that the transmit/receive element 122 may be configured to transmit and/or receive any combination of wireless signals.
  • the WTRU 102 may include any number of transmit/receive elements 122. More specifically, the WTRU 102 may employ MIMO technology. Thus, in one embodiment, the WTRU 102 may include two or more transmit/receive elements 122 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 116.
  • the transceiver 120 may be configured to modulate the signals that are to be transmitted by the transmit/receive element 122 and to demodulate the signals that are received by the transmit/receive element 122.
  • the WTRU 102 may have multi-mode capabilities.
  • the transceiver 120 may include multiple transceivers for enabling the WTRU 102 to communicate via multiple RATs, such as UTRA and IEEE 802.11, as examples.
  • the processor 118 of the WTRU 102 may be coupled to, and may receive user input data from, the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128 (e.g., a liquid crystal display (LCD) display unit or organic light-emitting diode (OLED) display unit).
  • the processor 118 may also output user data to the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128.
  • the processor 118 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 130 and/or the removable memory 132.
  • the non-removable memory 130 may include random-access memory (RAM), read-only memory (ROM), a hard disk, or any other type of memory storage device.
  • the removable memory 132 may include a subscriber identity module (SIM) card, a memory stick, a secure digital (SD) memory card, and the like.
  • the processor 118 may access information from, and store data in, memory that is not physically located on the WTRU 102, such as on a server or a home computer (not shown).
  • the processor 118 may receive power from the power source 134, and may be configured to distribute and/or control the power to the other components in the WTRU 102.
  • the power source 134 may be any suitable device for powering the WTRU 102.
  • the power source 134 may include one or more dry cell batteries (e.g., nickel-cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), and the like), solar cells, fuel cells, and the like.
  • the processor 118 may also be coupled to the GPS chipset 136, which may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 102.
  • the WTRU 102 may receive location information over the air interface 116 from a base station and/or determine its location based on the timing of the signals being received from two or more nearby base stations. It will be appreciated that the WTRU 102 may acquire location information by way of any suitable location-determination method while remaining consistent with an embodiment.
  • the processor 118 may further be coupled to other peripherals 138, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity.
  • the peripherals 138 may include sensors such as an accelerometer, an e-compass, a satellite transceiver, a digital camera (for photographs or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands free headset, a Bluetooth® module, a frequency modulated (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.
  • FIG. 1 B depicts an exemplary network entity 190 that may be used in embodiments of the present disclosure, for example as an element in the RFID locating system, a video distribution system, or client device as described herein.
  • network entity 190 includes a communication interface 192, a processor 194, and non-transitory data storage 196, all of which are communicatively linked by a bus, network, or other communication path 198.
  • Communication interface 192 may include one or more wired communication interfaces and/or one or more wireless-communication interfaces. With respect to wired communication, communication interface 192 may include one or more interfaces such as Ethernet interfaces, as an example. With respect to wireless communication, communication interface 192 may include components such as one or more antennae, one or more transceivers/chipsets designed and configured for one or more types of wireless (e.g., LTE) communication, and/or any other components deemed suitable by those of skill in the relevant art. And further with respect to wireless communication, communication interface 192 may be equipped at a scale and with a configuration appropriate for acting on the network side— as opposed to the client side— of wireless communications (e.g., LTE communications, Wi-Fi communications, and the like). Thus, communication interface 192 may include the appropriate equipment and circuitry (perhaps including multiple transceivers) for serving multiple mobile stations, UEs, or other access terminals in a coverage area.
  • Processor 194 may include one or more processors of any type deemed suitable by those of skill in the relevant art, some examples including a general-purpose microprocessor and a dedicated DSP.
  • Data storage 196 may take the form of any non-transitory computer-readable medium or combination of such media, some examples including flash memory, read-only memory (ROM), and random-access memory (RAM) to name but a few, as any one or more types of non-transitory data storage deemed suitable by those of skill in the relevant art could be used.
  • data storage 196 contains program instructions 197 executable by processor 194 for carrying out various combinations of the various network-entity functions described herein.
  • FIG. 2 depicts a system architecture for use of RFID information with a video camera system, in accordance with an embodiment.
  • the system architecture of FIG. 2 may be used in both fixed-camera systems as well as camera systems that are realized with a mobile phone, tablet, or other non-fixed camera devices.
  • the camera's location and orientation is mapped to the standard-referenced RFID information.
  • the RFID location information of the tracked objects is referenced to a predetermined or otherwise standardized geometry of the RFID tracking area, such as the stadium or sports arena.
  • the camera location and orientation is determined through an initial calibration procedure that comprises pointing the camera at one or more predetermined locations in the RFID tracking area.
  • Another method used in some embodiments to determine a camera's location and orientation includes using wireless radio signal ranging techniques.
  • the initial location and orientation information can then be updated with data from the camera's internal sensors, such as the camera's motion sensors (e.g. an inertial measurement unit).
  • both a Wide-Field of View (W-FOV) camera 202 and a Narrow-Field of View (N-FOV) camera 204 are used to capture video.
  • the cameras are generally oriented in the same direction, resulting in the N-FOV camera 204 capturing a portion of the image captured by the W-FOV camera 202.
  • either one or both of the cameras 202 and 204 have configurable optical settings, such as zoom and depth of field.
  • either one or both of the cameras are swivel-mounted, with the direction of view able to be mechanically changed by changing the orientation of the swivel-mounted camera.
  • the output video from the cameras is decompressed by the respective decompression modules 206 and 208.
  • either one or both of the cameras 202 and 204 output decompressed video, and the respective decompression module(s) are omitted.
  • the W-FOV Decompression Module 206 provides decompressed video to the RFID Mapping Module 210 and the Crop, Alignment, and Upscale Module 212.
  • the RFID Mapping Module 210 receives frame-level RFID data 214 in real-time. Based on the position and orientation of the W-FOV image, an orthographic mapping is performed between the RFID data 214 and the view of the W-FOV camera 202. An identity of the object of interest associated with the tracked RFID tag may then be displayed on the video screen, based on the mapping. At element 216, a user may then select an object of interest to track. In other embodiments, the object-of-interest is selected via an automated process.
  • the RFID Mapping Module 210 provides the location of the user-selected object of interest within the field of view to the N-FOV Crop Region Module 218.
  • the N-FOV Crop Region Module 218 determines information to track the object of interest in the N-FOV camera view. This determination is based on the orientation information of the W-FOV camera viewpoint, the RFID location information, and the user-selected object of interest.
  • the cropping information is continually fed to the crop module 212, for example every frame, every GOP, or every set time period.
  • the decompressed video from the N-FOV camera 204 is provided to the Time Alignment and Disparity Compensation Module 220.
  • the Time Alignment and Disparity Compensation Module 220 aligns the video from the N-FOV camera 204 with the video from the W-FOV camera 202.
  • the View Selection Module 222 selects between the video from the W-FOV camera 202 and the video from the N-FOV camera 204. For example, if the selected object of interest is within the field of view of the N-FOV camera, the image from the N-FOV camera is selected for display. If the object of interest is not within the field of view of the N-FOV camera, a cropped and upscaled image from the W-FOV camera is selected for display.
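The decision made by the View Selection Module can be expressed as a containment test: when the mapped location of the selected object lies inside the N-FOV frame, that frame is displayed; otherwise a cropped and upscaled region of the W-FOV frame is used. The sketch below is illustrative only; the margin parameter and the reuse of the crop_and_zoom helper shown earlier are assumptions.

```python
def select_view(n_fov_frame, w_fov_frame, object_loc_n, object_loc_w, margin=20):
    """Choose the display source for the selected object of interest.

    n_fov_frame  : current frame from the narrow-FOV camera
    w_fov_frame  : current frame from the wide-FOV camera
    object_loc_n : (u, v) location of the object in N-FOV pixel coordinates, or None
    object_loc_w : (u, v) location of the object in W-FOV pixel coordinates
    margin       : border width inside which the object is treated as out of view
    """
    if object_loc_n is not None:
        h, w = n_fov_frame.shape[:2]
        u, v = object_loc_n
        if margin <= u <= w - margin and margin <= v <= h - margin:
            return n_fov_frame  # object visible in the zoom camera
    # Fall back to a cropped and upscaled region of the wide view.
    return crop_and_zoom(w_fov_frame, object_loc_w,
                         output_size=(n_fov_frame.shape[1], n_fov_frame.shape[0]))
```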
  • FIG. 3 depicts an example sequence diagram for RFID based tagging and tracking of objects of interest in camera views, in accordance with an embodiment.
  • FIG. 3 depicts a sequence of events between an RFID tracking infrastructure 302, a smartphone 304 having a W-FOV camera lens 306, an N-FOV camera lens 308, and a display 310. While FIG. 3 depicts a smartphone video camera, the solution may be realized with other video capture devices.
  • the RFID infrastructure 302 provides a continual broadcast of RFID position information of tracked objects to the smartphone 304.
  • a mapping between the RFID data and the camera views is performed based on the camera's position identified at step 314.
  • wide FOV video is captured.
  • a tracked object is detected in the W-FOV video.
  • the object is identified. The detection may be based on feature extraction of the camera image.
  • the video from the W-FOV camera is displayed with the mapped RFID meta-data overlaid on the image.
  • the user selects an object of interest to track.
  • the user selection is based on a touchscreen input, a text input, voice recognition, and the like.
  • the RFID infrastructure provides location information to the smartphone for a sub-set of tracked objects. For example, during a sporting game, each player's position is tracked, but only the subset of players that are in the game, as opposed to waiting on the side-lines, have their position information provided to the smartphone.
  • a desired view of the N-FOV camera is determined.
  • the desired view contains the image of the selected object of interest.
  • the desired N-FOV camera view may be obtained by changing the optical parameters of the N-FOV camera, changing the direction of a swivel-based N-FOV camera, and the like.
  • the controls to obtain the desired N-FOV camera view are determined, and at step 332, the controls are provided to the N-FOV camera.
  • the N-FOV camera receives the instructions and at step 332 responsively captures the desired view.
  • the image may further be cropped and zoomed, as appropriate.
  • the image is displayed.
  • the display may be on the smartphone screen or it may also be transmitted for display on a remote device.
  • FIG. 4 depicts a video capture device displaying RFID information overlaid on a video, in accordance with an embodiment.
  • FIG. 4 depicts the video capture device 400.
  • the video capture device may be realized with a smartphone comprising both an N-FOV camera lens and a W-FOV camera lens located on a side of the smartphone opposite the smartphone's display.
  • the display of the smartphone depicts the view of the W-FOV camera lens on the top portion of the display and the view from the N-FOV camera lens on the bottom portion of the display.
  • the RFID location information has been mapped to the camera views, and RFID metadata is overlaid on the video image at locations corresponding to the associated objects of interest.
  • the metadata may be formatted and customized.
  • the jersey number for each player is displayed over the image of the player, as determined by the mapping between the RFID location information and the camera position.
  • the metadata may instead depict the player's name, statistics associated with the player, and the like.
  • the look and feel of the displayed metadata may also be formatted. For example, the background color and the font color of the displayed metadata may change. In one such example, information for players on a first team is displayed in the first team's colors and information for players on a second team is displayed in the second team's colors.
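An overlay of this kind can be drawn by placing each player's metadata string at the corresponding mapped pixel location in a per-team colour. The sketch below uses OpenCV drawing calls; the colour table, box size, and text styling are illustrative assumptions.

```python
import cv2

TEAM_COLORS = {"home": (0, 0, 255), "away": (255, 255, 255)}  # BGR values, illustrative only

def overlay_metadata(frame, tracked_objects):
    """Draw jersey numbers (or other metadata) over the mapped object locations.

    tracked_objects : list of dicts with keys "label" (e.g. a jersey number),
                      "team", and "pixel" (u, v) mapped location in the frame.
    """
    for obj in tracked_objects:
        u, v = int(obj["pixel"][0]), int(obj["pixel"][1])
        color = TEAM_COLORS.get(obj["team"], (0, 255, 0))
        cv2.rectangle(frame, (u - 20, v - 40), (u + 20, v - 10), color, 2)
        cv2.putText(frame, str(obj["label"]), (u - 15, v - 18),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, color, 2)
    return frame
```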
  • the user may select to follow an object of interest displayed in the W-FOV camera view.
  • the user has selected to follow the player associated with the "12" RFID metadata.
  • a desired field of view, as captured by the N-FOV camera, is determined to capture the player associated with the "12" RFID metadata.
  • Camera controls are provided to the N-FOV camera to obtain the desired view for the selected player.
  • the N-FOV camera operates per the received controls to obtain the desired view.
  • the view from the N-FOV camera is then displayed.
  • the RFID metadata information is displayed on the zoomed in N-FOV camera view.
  • the smartphone displays the W-FOV camera view and after receiving user input for a selected object of interest, transitions to the N-FOV camera view of the selected object of interest.
  • FIGs. 5A-C depict a series of video frames illustrating a method of dealing with an optical occlusion, in accordance with an embodiment.
  • the locations of the objects are determined based on both image-based tracking and RFID-based tracking.
  • the image-based location tracking may be prevented by optical occlusions.
  • FIG. 5A depicts a video frame at a first point in time, with the location of an object of interest (in this example, runner 502) being tracked.
  • the position of the object of interest may be indicated on a display of the video capture device, e.g. with rectangle 504.
  • FIG. 5B depicts a video frame at a second point in time after the first point in time.
  • the object of interest 502 is occluded by a tree 506.
  • the video capture device is able to continue tracking the object of interest using wirelessly-received coordinates of the object. These coordinates may be derived, for example, from GPS data and/or from RFID tracking.
  • the video capture device tracks the position and orientation of the camera capturing the video and maps the wirelessly-received coordinates of the object of interest to a mapped location within the video. While the object 502 is occluded and optical tracking is unavailable, an indicator such as rectangle 504 may continue to be displayed at the mapped location.
  • FIG. 5C depicts a video frame at a third point in time after the second point in time.
  • the object of interest 502 is no longer occluded by the tree, and optical tracking may resume.
  • the indicator 504 may be displayed to indicate the position of the object of interest as determined by optical tracking.
  • the location of the object of interest may initially be tracked with image-based locating methods. However, these methods may fail in the event of image occlusion, such as the occlusion shown in FIG. 5B.
  • the location of a selected object of interest may be supplemented by RFID-based locating methods.
  • a camera system provides a desired N-FOV view of the selected object of interest.
  • the desired N-FOV view is based on image-based locating methods.
  • the desired N-FOV camera view may be maintained by shifting the object of interest locating method from an image-based locating method to an RFID-based locating method, such as the methods disclosed herein.
  • the locating method may be shifted back after the occlusion is cleared or after a set period of time.
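The hand-over between image-based and RFID-based locating can be modelled as a small state machine: optical tracking is used while detections are available, the RFID-mapped location takes over during an occlusion, and optical tracking resumes once a detection reappears. The sketch below is a minimal illustration; the class name, the per-frame interface, and the short hold-off before switching back are assumptions.

```python
class ObjectLocator:
    """Switch between image-based and RFID-based locating of the selected object."""

    def __init__(self, min_rfid_frames=30):
        self.mode = "optical"
        self.frames_in_rfid = 0
        self.min_rfid_frames = min_rfid_frames  # brief hold-off to avoid rapid switching

    def locate(self, optical_loc, rfid_mapped_loc):
        """optical_loc is the (u, v) optical detection, or None while the object is
        occluded; rfid_mapped_loc is the RFID position mapped into the video frame."""
        if optical_loc is None:
            # Optical tracking failed (e.g. an occlusion): use the RFID-mapped location.
            self.mode, self.frames_in_rfid = "rfid", 0
            return rfid_mapped_loc
        if self.mode == "rfid" and self.frames_in_rfid < self.min_rfid_frames:
            # The occlusion just cleared: keep the RFID location for a few frames
            # before handing control back to optical tracking.
            self.frames_in_rfid += 1
            return rfid_mapped_loc
        self.mode = "optical"
        return optical_loc
```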
  • FIG. 6 depicts an example method, in accordance with an embodiment.
  • FIG. 6 depicts the method 600.
  • the method 600 may be carried out using the system architecture of FIG. 2.
  • the method 600 includes capturing a first image of a scene with a first camera having a W-FOV.
  • the method 600 also includes receiving object location information from an RFID-based locating system.
  • the method 600 also includes determining the position and orientation of the camera system in relation to a coordinate system used for the RFID-based object locating system.
  • the method 600 also includes mapping the location of the tracked objects to the image of the W-FOV camera based on the determined position and orientation of the camera.
  • the method 600 also includes detecting the tracked objects in the W-FOV camera lens image.
  • the method 600 also includes mapping metadata associated with the tracked objects to the detected objects in the W-FOV camera lens image.
  • the method 600 also includes displaying the metadata in the image, overlaid on the video from the W-FOV camera.
  • the method 600 includes receiving a selection of an object of interest.
  • the method 600 includes, in response to an indication of a selected object of interest, determining a desired view of the W-FOV camera to obtain the selected object of interest.
  • a desired view of the N-FOV camera to obtain the selected object of interest is determined.
  • the N-FOV camera lens captures an image of a portion of the scene captured by the W-FOV camera lens.
  • the images from the N-FOV and W-FOV cameras are stabilized and aligned.
  • the W-FOV camera image may be cropped and upscaled to match the N-FOV camera image. Both camera images include the selected object of interest.
  • the method 600 includes selecting between the W-FOV camera view and the N-FOV camera view.
  • a selection module may determine to display either the video from the W-FOV camera or the N-FOV camera.
  • the selected view is displayed.
  • the method 600 may further be modified to select an object of interest in the W-FOV camera view, detect the selected object of interest in the N-FOV camera view, and crop and zoom in on the selected object of interest in the N-FOV camera view. The cropped and zoomed view from the N-FOV camera is then displayed.
  • FIG. 7 depicts an example method, in accordance with an embodiment.
  • the method 700 includes capturing a video of a scene with a camera.
  • the method 700 includes determining a position and orientation of the camera.
  • the method 700 includes wirelessly receiving coordinates indicating the position of wirelessly tracked objects.
  • the method 700 includes mapping the wirelessly received coordinates to respective mapped locations in the captured video.
  • the method 700 may include receiving a request to select an object of interest to track.
  • the method 700 includes determining if the object of interest is obscured in the captured video.
  • the method 700 includes optically tracking the selected object of interest if the object is not obscured.
  • the method 700 includes tracking the selected object of interest using the respective mapped location in the captured video, if the object is obscured.
  • a method of providing object identification and tracking in a system with multiple camera modules includes the following steps.
  • a first image of a scene is captured with a first camera having a wide field of view.
  • Object position information for a tracked object is received.
  • the position and orientation of the first camera in relation to a coordinate system used for the object position information is determined.
  • the object position information is mapped into the coordinates of the first camera.
  • the tracked object in the first image is detected.
  • Information from the object position is fused with the detected tracked objects. Tracked objects in the first image from the first camera are identified using the object position information.
  • a description of the tracked object is overlaid on the first image of the first camera provided to a display.
  • a second image of the scene is captured from a second camera having a narrow field of view, the video of the first and second cameras is stabilized, images from both the first camera's video and from the second camera's video are extracted, a source for a zoom view between the first and second cameras is selected, and the image from the selected source is displayed.
  • a video capture device includes a wide-field of view camera, a narrow field of view camera, a processor, and a non-transitory computer-readable medium storing instructions that are operative, when executed on the processor, to perform the functions described in the preceding paragraph.
  • the object position information is a radio-frequency identification (RFID) based position.
  • the object position information for a tracked object comprises image-based position information, and responsive to detecting a failure of the image-based position information, radio-frequency identification (RFID) based position information is received.
  • the method further includes editing the video from the second camera prior to displaying the image from the selected source, wherein the editing includes the following steps.
  • An object in the first image is selected.
  • the tracked object is detected in the second image.
  • the second image is cropped to zoom in on the tracked object in the second image.
  • the multiple camera modules are disposed on a wireless communication device.
  • the wireless communication device is a smart phone.
  • fusing information from the object position with the detected tracked objects includes calibrating the video cameras to an RFID-locating system reference system. In some such embodiments, fusing information from the object position with the detected tracked objects includes performing an orthogonal mapping.
  • a method includes the following steps.
  • a first video of a scene is captured with a wide field of view (W-FOV) camera.
  • Position information and metadata information for a tracked object within the scene is received.
  • the position information is mapped to the tracked object within the scene.
  • the metadata information is displayed on a display of the first video based on the mapped position.
  • a video-tracking request for the tracked object is received.
  • a second video of a portion of the scene is captured with a narrow field of view (N-FOV) camera having the tracked object, and the second video is displayed.
  • a video capture device includes a wide-field of view camera, a narrow field of view camera, a processor, and a non-transitory computer-readable medium storing instructions that are operative, when executed on the processor, to perform the functions described in the preceding paragraph.
  • either one or both of the W-FOV and N-FOV cameras include configurable optical settings and capturing video of the scene includes configuring the optical settings for the respective camera to capture the tracked object.
  • either one or both of the W-FOV and N-FOV cameras include a configurable orientation and capturing video of the scene includes configuring the orientation for the respective camera to capture the tracked object.
  • the configurable orientation includes a swivel-based camera.
  • the position information is RFID-based position information.
  • the position information is received continually.
  • the position information is received on a per-frame basis.
  • the position information is received periodically.
  • the video-tracking request includes receiving a touch-screen input associated with detecting a user touch on a location of the displayed metadata.
  • the method further includes determining the tracked object is no longer in the second image and responsively transitioning to displaying a cropped view of the first image, the cropped view including the tracked object.
  • the method further includes displaying, on a first portion of the display, the first image with overlaid metadata, and displaying, on a second portion of the display that is different from the first portion, the second image.
  • the method also includes displaying overlaid metadata on the second portion of the display.
  • the position information of the tracked object includes image-based position information.
  • the method further includes receiving radio-frequency identification (RFID) based position information in response to detecting a failure of the image-based position information.
  • detecting a failure of the image-based position information includes detecting an optical occlusion between the camera and the tracked object.
  • the method further includes transitioning from the RFID-based position information back to the image-based position information in response to detecting a subsequent availability of the image-based position information.
  • detecting the subsequent availability of the image-based position information includes detecting that an optical occlusion is no longer between the camera and the tracked object.
  • a method includes the following steps.
  • a first video of a scene is captured with a wide field of view (W-FOV) camera.
  • RFID-based position information for a tracked object in the captured scene is received.
  • Metadata associated with the tracked object on the first video is displayed based on a mapping of the position information to the tracked object in the first video.
  • a zoom-request for the tracked object is received.
  • a second video of the scene including the tracked object is captured with an N-FOV camera, wherein capturing the second video is based on the position information of the tracked object.
  • the second video is displayed.
  • the method further includes (i) determining that the second video no longer comprises the tracked object and (ii) transitioning to displaying a cropped and upscaled version of the first video, the cropped and upscaled version including the tracked object.
  • Examples of computer-readable storage media include a read-only memory (ROM), a random-access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks and digital versatile disks (DVDs).
  • a processor in association with software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host computer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Studio Devices (AREA)

Abstract

Systems and methods are described for video tracking that allow one or more objects to be tracked in a video even if the objects are occluded or otherwise unavailable to optical tracking methods. In one such method, video of a scene is captured with a camera-equipped device. A selected object in the captured video is optically tracked to determine an optically-tracked location within the captured video. A position and orientation of the camera are determined. The device wirelessly receives coordinates that indicate the position of the selected object. Based on the position and orientation of the camera, the received coordinates are mapped to a mapped location in the captured video, which may be represented by pixel coordinates. In response to a determination that the selected object is obscured in the captured video, the mapped location is used to track the selected object.
PCT/US2018/034653 2017-06-01 2018-05-25 Tracking of objects of interest with a zoom lens based on radio-frequency identification (RFID) WO2018222532A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762514006P 2017-06-01 2017-06-01
US62/514,006 2017-06-01

Publications (1)

Publication Number Publication Date
WO2018222532A1 2018-12-06

Family

ID=62599751

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/034653 WO2018222532A1 (fr) 2017-06-01 2018-05-25 Tracking of objects of interest with a zoom lens based on radio-frequency identification (RFID)

Country Status (1)

Country Link
WO (1) WO2018222532A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100026809A1 (en) * 2008-07-29 2010-02-04 Gerald Curry Camera-based tracking and position determination for sporting events
EP2264679A1 (fr) * 2008-03-11 2010-12-22 Panasonic Corporation Tag sensor system and sensor device, and object position estimating device and object position estimating method
US20110135149A1 (en) * 2009-12-09 2011-06-09 Pvi Virtual Media Services, Llc Systems and Methods for Tracking Objects Under Occlusion
US20130329950A1 (en) * 2012-06-12 2013-12-12 Electronics And Telecommunications Research Institute Method and system of tracking object
WO2014026065A1 (fr) * 2012-08-09 2014-02-13 Sapoznikow Michael Systems and methods for tracking players based on video data and radio-frequency identification (RFID) data
US20160335484A1 (en) * 2015-03-11 2016-11-17 Fortinet, Inc. Access point stream and video surveillance stream based object location detection and activity analysis

Similar Documents

Publication Publication Date Title
US9721392B2 (en) Server, client terminal, system, and program for presenting landscapes
US10468066B2 (en) Video content selection
US9836886B2 (en) Client terminal and server to determine an overhead view image
WO2018164932A1 (fr) Codage de grossissement utilisant de multiples captures de caméras simultanées et synchrones
US9159169B2 (en) Image display apparatus, imaging apparatus, image display method, control method for imaging apparatus, and program
CN108399349B Image recognition method and apparatus
US20190253747A1 (en) Systems and methods for integrating and delivering objects of interest in video
US8350931B2 (en) Arrangement and method relating to an image recording device
US8750559B2 (en) Terminal and method for providing augmented reality
JP5532026B2 Display device, display method, and program
US10674066B2 (en) Method for processing image and electronic apparatus therefor
US8478308B2 (en) Positioning system for adding location information to the metadata of an image and positioning method thereof
US20130054137A1 (en) Portable apparatus
US9635234B2 (en) Server, client terminal, system, and program
US20140210941A1 (en) Image capture apparatus, image capture method, and image capture program
JP2019114147A Information processing apparatus, control method for information processing apparatus, and program
US10681335B2 (en) Video recording method and apparatus
US20210258505A1 (en) Image processing apparatus, image processing method, and storage medium
WO2015068447A1 (fr) Dispositif de traitement d'informations, procédé de traitement d'informations, et système de traitement d'informations
WO2022206595A1 (fr) Procédé de traitement d'images et dispositif associé
US20130265332A1 (en) Information processing apparatus, control method of information processing apparatus, and storage medium storing program
WO2018222532A1 (fr) Suivi d'objets d'intérêt par objectif zoom sur la base d'une identification par radiofréquence (rfid)
JP4156552B2 (ja) 撮像システム、撮像装置、撮像方法、及び撮像プログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18731311

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18731311

Country of ref document: EP

Kind code of ref document: A1