WO2016040404A1 - Video capture with privacy safeguard - Google Patents
- Publication number
- WO2016040404A1 (PCT/US2015/049055)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- human
- video camera
- recording
- imaging system
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/20—Scenes; Scene-specific elements in augmented reality scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
- G06V10/12—Details of acquisition arrangements; Constructional details thereof
- G06V10/14—Optical characteristics of the device performing the acquisition or on the illumination arrangements
- G06V10/143—Sensing or illuminating at different wavelengths
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B19/00—Driving, starting, stopping record carriers not specifically of filamentary or web form, or of supports therefor; Control thereof; Control of operating function ; Driving both disc and head
- G11B19/02—Control of operating function, e.g. switching from recording to reproducing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/45—Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from two or more image sensors being of different type or operating in different modes, e.g. with a CMOS sensor for moving images in combination with a charge-coupled device [CCD] for still images
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/61—Control of cameras or camera modules based on recognised objects
- H04N23/611—Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/765—Interface circuits between an apparatus for recording and another apparatus
- H04N5/77—Interface circuits between an apparatus for recording and another apparatus between a recording apparatus and a television camera
- H04N5/772—Interface circuits between an apparatus for recording and another apparatus between a recording apparatus and a television camera the recording apparatus and the television camera being placed in the same enclosure
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/44—Event detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
Definitions
- FIG. 1 shows aspects of an example imaging system 10 in one non-limiting example.
- the imaging system of FIG. 1 takes the form of a wearable, near-eye display system with continuous video-capture capability; it includes a controller 12 operatively coupled to right and left display elements 14.
- the controller sends appropriate drive signals to each display element to control the virtual display imagery formed therein.
- each display element includes a light-emitting diode (LED) backlight positioned behind a transmissive liquid-crystal display (LCD) array.
- Other display-element examples may include a reflective LCD array such as a liquid-crystal-on-silicon (LCOS) array.
- an active-matrix LED array or scanning laser beam may be used to provide the virtual display imagery.
- the right and left display elements are optically coupled each to a corresponding display window 15.
- Each display window may be configured with beam-turning and/or pupil-expanding functionality, so that the virtual display images formed by display elements 14 are presented to the wearer's eyes.
- Display windows 15 may be at least partially transparent. This feature allows the virtual imagery from display elements 14 to be combined with real imagery sighted through the display windows, to provide an 'augmented reality' (AR) experience for the wearer of imaging system 10.
- The wearer, herein, is more generally referred to as an 'operator' or 'user' of the imaging system.
- video camera 16 may be configured to record any or all of the real imagery 18 sighted by the operator through display windows 15.
- the video camera includes an objective lens system 20 that collects light over a field of view (FOV) 22 and directs such light onto an imaging array 24.
- the imaging array of the video camera may be a high-speed, high-resolution red / green / blue (RGB) complementary metal oxide semiconductor (CMOS) array, in one example.
- the imaging array is operatively coupled to controller 12, which receives image data from the array.
- Positioned between the objective lens system and the imaging-array aperture is an electronically closable shutter 26.
- Imaging system 10 may be configured to support various input modalities in order to receive operator input.
- pushbuttons arranged on the frames of the imaging system may support manual input.
- a microphone and associated speech-recognition logic in controller 12 may support voice recognition.
- the imaging system may be configured to track the gaze direction of the operator, and to apply the gaze direction as a form of operator input.
- imaging system 10 of FIG. 1 includes right and left eye-imaging cameras 28.
- The eye-imaging cameras image the operator's eyes to resolve such features as the pupil centers, pupil outlines, or corneal glints created by off-axis illumination of the eyes.
- the positions of such features in the right and left eye images are provided as input parameters to a model, executed in controller 12, that computes gaze direction.
- the gaze direction may be used as position data for interacting with a graphical user interface projected into a user's field of view and/or for receiving eye gesture inputs, for example.
- image data from the eye-imaging cameras may be used to assess eyelid opening and closure— e.g., to detect winking and blinking, which also may serve as forms of operator input.
- video camera 16 of imaging system 10 may be configured to automatically record the real imagery sighted by the operator of the imaging system and located within the FOV of the video camera. This scenario is shown also in FIG. 2, where bystander 32 is present in FOV 22, and is sighted by operator 30. This disclosure is directed, in part, to safeguarding the privacy of the bystander, who may not want to be recorded.
- imaging system 10 may be configured to record video only when it is determined that no bystanders are within FOV 22, except those who have confirmed their willingness to be recorded.
- controller 12 of FIG. 1 includes a face-recognition engine 34 configured to process the video stream acquired by video camera 16.
- the face-recognition engine may have access to one or more stored facial images (or other identifying information) of persons confirmed as willing to be recorded. If a bystander is encountered who is not confirmed as willing to be recorded, then video recording may be suspended until that person's willingness can be confirmed (vide infra).
- Triggers of various kinds may be used to initiate video recording on startup of imaging system 10, or to resume video recording after it has been suspended, in a manner respectful of bystander privacy. For example, in some scenarios, it may be left to the operator to determine whether an unwilling bystander is present in FOV 22, and to initiate / resume recording when no such bystander is present. In other scenarios, a few frames of video may be captured provisionally and analyzed in face-recognition engine 34. If the face-recognition engine determines that the FOV includes no bystanders except those whose willingness has been confirmed, then continuous video capture may be enabled.
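The provisional-capture trigger described above can be sketched as follows. This is a hedged illustration rather than the patent's implementation; `capture_frames`, `detect_faces`, and `is_confirmed_willing` are hypothetical callables assumed to be supplied by the surrounding system (e.g., by face-recognition engine 34).

```python
def may_enable_recording(capture_frames, detect_faces, is_confirmed_willing,
                         n_provisional=5):
    """Return True if a few provisional frames show no unconfirmed bystander.

    capture_frames(n) yields n provisional video frames; detect_faces(frame)
    returns the faces found in a frame; is_confirmed_willing(face) checks a
    face against the stored images of persons willing to be recorded.
    """
    for frame in capture_frames(n_provisional):
        for face in detect_faces(frame):
            if not is_confirmed_willing(face):
                # An unconfirmed bystander is present: keep capture disabled.
                return False
    return True
```

If this gate returns True, continuous video capture may be enabled; otherwise the system waits and re-checks later.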
- imaging system 10 may include a sensor 36, which is separate from video camera 16 but configured to acquire sensory data at least over the FOV 22 of the video camera.
- sensor 36 has an FOV 38, which overlaps FOV 22.
- Controller 12 may be configured to parse sensory data from the sensor for evidence of a human being in FOV 22 and to enable recording of video with the video camera if no human being is detected in the FOV, based upon the sensory data.
- sensor 36 is a far-infrared (FIR) sensor—i.e., a non-contact temperature sensor.
- the FIR sensor may be responsive over a wavelength range of 1 to 10 micrometers, in some examples. Both imaging and non-imaging FIR sensors may be useful for detecting the presence of human beings.
- A non-imaging (e.g., single-pixel) FIR sensor may be used to determine whether any object in the video camera's FOV is above a threshold temperature—e.g., > 30 °C at the surface of the object sighted by the sensor.
- Controller 12 may be configured to initiate or resume video recording only if no such object is present in the FOV.
- high-pass filtering of the sensory signal may be used to distinguish a moving human being from a warm, stationary object, such as a lamp.
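A minimal sketch of this single-pixel gating logic, assuming the 30 °C threshold mentioned above and approximating the high-pass filter with a first-difference motion test (the 0.5 °C motion epsilon is an assumed value, not from the source):

```python
THRESHOLD_C = 30.0   # surface temperature above which a human is suspected
MOTION_EPS_C = 0.5   # minimum frame-to-frame change; assumed value

def human_suspected(readings):
    """Parse a time series of single-pixel FIR temperatures (deg C).

    A human is suspected only if some reading is warm AND the signal varies
    over time; a warm but stationary object (e.g., a lamp) yields a nearly
    constant signal and is rejected by the crude first-difference test.
    """
    warm = any(t > THRESHOLD_C for t in readings)
    moving = any(abs(b - a) > MOTION_EPS_C
                 for a, b in zip(readings, readings[1:]))
    return warm and moving
```

A steady 45 °C reading (a lamp) fails the motion test, while a fluctuating above-threshold signal (a person moving through the FOV) passes both tests.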
- An imaging FIR sensor 36 may be used to detect human beings based on thermal mapping.
- sensor 36 may be a MEMS-based thermopile-array sensor, for example.
- the sensor array may have a resolution and 'color' depth significantly lower than that of video camera 16.
- Such a sensor may output a relatively low- resolution thermal image of the FOV of the video camera, as shown in FIG. 3.
- Controller 12 may be configured to analyze the thermal image to detect one or more human- like shapes and to initiate or resume video recording only if no human-like shape is present in the FOV.
- the controller may include a shape-recognition engine 40 (referring again to FIG. 1).
- Higher-resolution FIR image data, even if readily available, may not be desirable for at least two reasons. First, the compute power required to analyze an image increases as the square of the resolution, so using a lower resolution may help to conserve system resources. Second, it is possible that sufficiently high-resolution FIR image data may allow an unwilling bystander to be identified.
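One way the shape-recognition step might operate on a low-resolution thermopile frame is sketched below. The connected-component grouping is a standard flood fill; the 'human-like' size-and-extent heuristic is purely illustrative and far cruder than a real shape-recognition engine such as engine 40.

```python
def warm_components(grid, threshold=30.0):
    """Group above-threshold pixels of a low-res thermal map (a list of
    rows of temperatures in deg C) into 4-connected components; returns a
    list of sets of (row, col) coordinates."""
    warm = {(r, c) for r, row in enumerate(grid)
            for c, t in enumerate(row) if t > threshold}
    comps = []
    while warm:
        stack = [warm.pop()]
        comp = set(stack)
        while stack:
            r, c = stack.pop()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in warm:
                    warm.remove(nb)
                    comp.add(nb)
                    stack.append(nb)
        comps.append(comp)
    return comps

def human_like(comp):
    """Toy 'head and body' test: a warm component large enough, and tall
    enough to span several rows. Thresholds here are illustrative only."""
    rows = [r for r, _ in comp]
    return len(comp) >= 4 and max(rows) - min(rows) >= 2
```

A single hot pixel (a lamp) forms a one-pixel component and is rejected, while a vertically extended warm blob passes the heuristic.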
- sensor 36 in the form of an imaging FIR array may also be used to support gesture detection.
- A bystander aware that video may be recorded in her proximity may use a gesture to signal either willingness or unwillingness to be recorded. Examples include a 'thumbs-up' gesture to signal willingness, or, as illustrated in FIG. 4, a 'Let me be' gesture to signal unwillingness.
- Shape-recognition engine 40 may be configured to detect such gestures in a low-resolution thermal map or in other sensor data.
- While the foregoing examples use FIR-based sensory data to indicate the presence of a human being, and optionally to detect gestures, other types of sensory data may be used instead of or in addition to FIR-based sensory data.
- Virtually any form of sensory data may be utilized, as long as the data allows a bystander to be detected as a human being but does not enable the bystander to be personally identified.
- the sensory data may be chosen to provide below-threshold fidelity in imaging the subject and/or the environment in which the subject is located. Additional sensory modes adaptable for this purpose may include low-resolution visible or near-infrared imaging, low-resolution time-of-flight depth imaging, ultrasonic and millimeter-wave imaging, among others.
- FIG. 5 illustrates an example method 42 to record video with a video camera while respecting bystander privacy.
- sensory data separate from the video is acquired.
- the sensory data may include imaging or non-imaging FIR data, or virtually any type of data that enables a human being to be detected without being identified.
- the sensory data is parsed for evidence of a bystander—i.e., a human being in the FOV of the video camera.
- the evidence of the human being may include a warm locus in the FOV, a warm moving locus, or, if a thermal image is available, a shape of above-threshold temperature corresponding to a head and body of a human being, as examples.
- FIG. 6 shows results of a simple experiment estimating the fraction of pixels from FIR footage collected over many thousands of frames in a warm outdoor temperature (24 °C), a cold outdoor temperature (11 °C), a cold indoor garage temperature (14 °C), an indoor office temperature (21 °C), and an indoor lobby temperature (19 °C) with people constantly coming in from cold outside settings.
- a single temperature threshold is chosen for these settings.
- the graph compares the fraction of faces for which at least one pixel is above that threshold (on the x axis) to the fraction of pixels falsely designated as faces (on the y axis). At a threshold of 85 °F (approximately 29 °C), for instance, approximately 89% of faces are detected while approximately 3% of non-face pixels are falsely flagged as face pixels.
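The FIG. 6 experiment can be reproduced in outline: sweep a temperature threshold over labeled pixel data and record, for each threshold, the fraction of faces having at least one above-threshold pixel alongside the fraction of background pixels falsely flagged. The data layout here is an assumption for illustration, not the patent's format.

```python
def roc_points(faces, background, thresholds):
    """Compute the trade-off curve of FIG. 6.

    faces: list of per-face pixel-temperature lists; background: flat list
    of background pixel temperatures. Returns a list of
    (threshold, face_detection_rate, false_positive_rate) tuples.
    """
    pts = []
    for th in thresholds:
        # A face counts as detected if any of its pixels exceeds the threshold.
        face_rate = sum(any(p > th for p in f) for f in faces) / len(faces)
        # A background pixel is a false positive if it exceeds the threshold.
        fp_rate = sum(p > th for p in background) / len(background)
        pts.append((th, face_rate, fp_rate))
    return pts
```

Scanning the returned points lets an implementer pick a single threshold balancing missed faces against false positives across the mixed indoor/outdoor settings described above.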
- Another way to reduce the occurrence of false positives in human-shape detection may be to filter the selected foreground pixels (or a single pixel in a non-imaging sensor configuration) based on whether such pixels exhibit a temporal variation consistent with an underlying physiological process—e.g., breathing or heart beat.
- imaging sensory data may be subject to principal component analysis.
- the method advances to 50, where it is determined whether confirmation of the bystander's willingness to be recorded has been received.
- the bystander's willingness may be confirmed in any suitable manner. For instance, the bystander may signal or otherwise indicate to the operator of the imaging system that he or she is willing to be recorded. The operator, then, may provide touch input, vocal input, gaze-direction input, etc., to the imaging system, which indicates that the bystander is willing to be recorded.
- the bystander's signal may be transmitted electronically, wirelessly, or in the form of a light pulse received by sensor 36 of imaging system 10.
- confirmation of the bystander's willingness may come in the form of a hand or body gesture— e.g., a thumbs-up gesture. This kind of gesture, detected via a thermal map or other low- resolution image data, may serve to confirm the bystander's willingness to be recorded.
- the act of confirming whether the bystander's willingness has been received may be non-invasive (or even unknown) to the bystander.
- the bystander may be directly queried.
- a feature vector may be assembled from the sensory data for each bystander encountered, and stored in a database if the bystander is confirmed unwilling to be recorded. Direct querying of the bystander may then be omitted if the bystander's feature vector matches that of a stored, unwilling bystander. It will be noted that assembled feature vectors may be of sufficient fidelity to enable a positive match of a previously observed bystander, but of insufficient fidelity to enable the bystander to be identified.
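A possible shape for the low-fidelity feature vector and opt-out matching is sketched below; the histogram binning, bin edges, and matching tolerance are all illustrative assumptions chosen to show the fidelity trade-off, not values from the source.

```python
def coarse_feature(thermal_frame, bins=4):
    """Reduce a low-res thermal frame to a tiny histogram feature vector:
    enough to re-recognize a previously seen silhouette, far too coarse to
    identify a person. Bin edges (deg C) are illustrative."""
    edges = [25.0, 30.0, 33.0]
    vec = [0] * bins
    for row in thermal_frame:
        for t in row:
            vec[sum(t > e for e in edges)] += 1
    return tuple(vec)

def matches_opt_out(frame, opt_out_db, tol=1):
    """True if the frame's feature vector is within a per-bin tolerance of
    any stored unwilling-bystander vector, so direct querying can be
    skipped."""
    v = coarse_feature(frame)
    return any(all(abs(a - b) <= tol for a, b in zip(v, u))
               for u in opt_out_db)
```

On a positive match, the system would treat the bystander as unwilling without querying them again.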
- recording of the video is initiated. Conversely, in scenarios where the sensory data provides sufficient evidence of a human being in the FOV of the video camera, recording of the video is delayed until confirmation is received that the human being is willing to be recorded. Recording of video with the video camera may be initiated automatically if no human being is detected in the video camera's FOV, based on the sensory data.
- the video is parsed in order to recognize one or more human beings. Then, at 56, it is determined whether the parsed video includes sufficient evidence (e.g., a face) of a bystander who is not already confirmed as willing to be recorded. This act may require the storing of one or more images of persons confirmed as willing to be recorded, for comparison against the real-time video. If the video contains sufficient evidence of a bystander not confirmed as willing to be recorded, then, at 58, the recording of the video is suspended. Otherwise, the method advances to 60 and 61, where the video and, optionally, the sensory data are parsed for a hand or body gesture indicating unwillingness of the bystander to be recorded.
- a gesture of this kind may also be used by a bystander to opt out of video recording, even if willingness to be recorded was previously signaled.
- the gesture may include a hand over a face of the bystander (as shown in FIG. 2), an alternative 'Let me be' gesture (as shown in FIG. 4), or virtually any other gesture identifiable in the video and/or sensory data. If, at 62, a gesture indicating unwillingness is recognized, then the method advances to 58, where video recording is suspended.
- video recording may be suspended in software— e.g., by not acquiring image frames via the video camera.
- a more positive act may be taken to suspend recording of the video.
- the appropriate gate bias may be removed from the imaging array of the video camera, which prevents any image from being formed.
- a physical shutter arranged over the camera aperture may be closed to prevent exposure of the array to real imagery. An advantage of the latter approach is that it broadcasts to the wary bystander that video capture has been disabled.
- the act of suspending video recording may be accompanied by flushing one or more of the most recent frames of already-acquired video from the memory of controller 12, to further protect the bystander's privacy.
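The suspend-and-flush behavior could be modeled with a ring buffer of recent frames, as in this sketch (the capacity and flush depth are illustrative; real values would depend on frame rate and the desired privacy window):

```python
from collections import deque

class PrivacyBuffer:
    """Ring buffer of recent video frames. Suspending recording both stops
    acquisition and flushes the most recent frames, so imagery of an
    unwilling bystander does not persist in memory."""

    def __init__(self, capacity=300, flush_n=30):
        self.frames = deque(maxlen=capacity)
        self.flush_n = flush_n
        self.recording = True

    def add(self, frame):
        if self.recording:
            self.frames.append(frame)

    def suspend(self):
        """Stop recording and drop the last flush_n frames already held."""
        self.recording = False
        for _ in range(min(self.flush_n, len(self.frames))):
            self.frames.pop()  # discard the most recent frame
```

After `suspend()`, calls to `add()` are ignored until recording is re-enabled, and the trailing frames captured just before suspension are gone.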
- After video recording is suspended, execution of method 42 returns to 44, where additional sensory data is acquired. Next, it is again determined whether a threshold amount of sensory evidence of a human being exists in the FOV, and if so, whether that human being is a bystander who may be willing to be recorded. Recording of the video may be resumed when it is confirmed that the bystander is willing to be recorded.
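The overall loop of method 42 can be condensed into a small transition function. This is a simplification of FIG. 5 in which delayed and suspended recording are folded into a single state; the state names and boolean inputs are illustrative, derived from the sensory data and video parsing steps described above.

```python
def step(state, human_detected, willing_confirmed, unwilling_gesture):
    """One transition of the recording loop.

    States: 'sensing' (startup), 'recording', 'suspended'.
    human_detected:     sensory data shows a human in the FOV
    willing_confirmed:  every detected human has confirmed willingness
    unwilling_gesture:  a 'Let me be' or similar opt-out gesture was seen
    """
    if state in ('sensing', 'suspended'):
        # Record only when the FOV is clear or everyone present has consented.
        if not human_detected or willing_confirmed:
            return 'recording'
        return 'suspended'
    if state == 'recording':
        # Any opt-out gesture or unconfirmed bystander suspends recording.
        if unwilling_gesture or (human_detected and not willing_confirmed):
            return 'suspended'
        return 'recording'
    raise ValueError(state)
```

Driving this function once per sensing interval reproduces the start, suspend, and resume behavior of the method.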
- the methods and processes described herein may be tied to a compute system of one or more computing machines— e.g., controller 12 of FIG. 1. Such methods and processes may be implemented as a hardware driver program or service, an application-programming interface (API), a library, and/or other computer-program product. Each computing machine includes a logic machine 64, an associated computer-memory machine 66, and a communication machine 68.
- Each logic machine includes one or more physical logic devices configured to execute instructions.
- a logic machine may be configured to execute instructions that are part of one or more applications, services, programs, routines, libraries, objects, components, data structures, or other logical constructs. Such instructions may be implemented to perform a task, implement a data type, transform the state of one or more components, achieve a technical effect, or otherwise arrive at a desired result.
- a logic machine may include one or more processors configured to execute software instructions. Additionally or alternatively, a logic machine may include one or more hardware or firmware logic machines configured to execute hardware or firmware instructions. Processors of a logic machine may be single-core or multi-core, and the instructions executed thereon may be configured for sequential, parallel, and/or distributed processing. Individual components of a logic machine optionally may be distributed among two or more separate devices, which may be remotely located and/or configured for coordinated processing. Aspects of a logic machine may be virtualized and executed by remotely accessible, networked computing devices configured in a cloud-computing configuration.
- Each computer-memory machine includes one or more physical, computer- memory devices configured to hold instructions executable by an associated logic machine to implement the methods and processes described herein. When such methods and processes are implemented, the state of the computer-memory machine may be transformed— e.g., to hold different data.
- a computer-memory machine may include removable and/or built-in devices; it may include optical memory (e.g., CD, DVD, HD- DVD, Blu-Ray Disc, etc.), semiconductor memory (e.g., RAM, EPROM, EEPROM, etc.), and/or magnetic memory (e.g., hard-disk drive, floppy-disk drive, tape drive, MRAM, etc.), among others.
- a computer-memory machine may include volatile, nonvolatile, dynamic, static, read/write, read-only, random-access, sequential-access, location- addressable, file-addressable, and/or content-addressable devices.
- a computer-memory machine includes one or more physical devices.
- aspects of the instructions described herein alternatively may be propagated by a communication medium (e.g., an electromagnetic signal, an optical signal, etc.), as opposed to being stored via a storage medium.
- aspects of a logic machine and associated computer-memory machine may be integrated together into one or more hardware-logic components.
- Such hardware-logic components may include field-programmable gate arrays (FPGAs), program- and application-specific integrated circuits (PASIC / ASICs), program- and application-specific standard products (PSSP / ASSPs), system-on-a-chip (SOC), and complex programmable logic devices (CPLDs), for example.
- The terms 'program' and 'engine' may be used to describe an aspect of a computer system implemented to perform a particular function.
- a program or engine may be instantiated via a logic machine executing instructions held by a computer-memory machine. It will be understood that different programs and engines may be instantiated from the same application, service, code block, object, library, routine, API, function, etc. Likewise, the same module, program, or engine may be instantiated by different applications, services, code blocks, objects, routines, APIs, functions, etc.
- a module, program, or engine may encompass individual or groups of executable files, data files, libraries, drivers, scripts, database records, etc.
- a communication machine may be configured to communicatively couple the compute system to one or more other machines, including server computer systems.
- the communication machine may include wired and/or wireless communication devices compatible with one or more different communication protocols.
- a communication machine may be configured for communication via a wireless telephone network, or a wired or wireless local- or wide-area network.
- a communication machine may allow a computing machine to send and/or receive messages to and/or from other devices via a network such as the Internet.
- One example provides a method to record video with a video camera while respecting bystander privacy. The method comprises acquiring thermal sensory data separate from the video, parsing the thermal sensory data for evidence of a human being in a field of view of the video camera, and recording video with the video camera if no human being is detected in the field of view based on the thermal sensory data.
- the above method may additionally or alternatively comprise, if a human being is detected in the field of view based upon the thermal sensory data, delaying recording of the video until confirmation is received that the human being is willing to be recorded.
- the above method may additionally or alternatively comprise parsing the video to recognize one or more human beings.
- the above method may additionally or alternatively comprise suspending recording of the video on recognizing a human being not confirmed as willing to be recorded.
- the above method may additionally or alternatively comprise resuming recording of the video when it is confirmed that the human being is willing to be recorded.
- the above method may additionally or alternatively comprise parsing the video to recognize a hand or body gesture of a human being.
- the above method may additionally or alternatively comprise suspending recording of the video on recognizing a hand or body gesture indicating unwillingness to be recorded. In some implementations, the above method may additionally or alternatively comprise parsing the thermal sensory data to recognize a hand or body gesture of a human being indicating willingness to be recorded, and resuming recording of the video on recognizing the hand or body gesture.
- Another example provides an imaging system comprising a video camera; separate from the video camera, a sensor configured to acquire sensory data over a field of view of the video camera; and a controller configured to parse the sensory data for evidence of a human being in the field of view of the video camera and to enable recording of video with the video camera if no human being is detected in the field of view based upon the sensory data.
- The sensor of the above imaging system may additionally or alternatively comprise a far-infrared sensor.
- the sensor may additionally or alternatively comprise a thermopile array sensor.
- the sensor may additionally or alternatively comprise an imaging sensor of lower resolution and/or color depth than the video camera.
- the above imaging system may additionally or alternatively comprise an electronically closable shutter arranged over an aperture of the video camera, wherein the controller is configured to keep the shutter closed when recording of the video is not enabled.
- In some implementations, the above imaging system is wearable and/or configured for continuous video acquisition.
- Another aspect of this disclosure is directed to another method to record video with a video camera while respecting bystander privacy.
- This method comprises acts of: acquiring far-infrared sensory data separate from the video; parsing the far-infrared sensory data for evidence of a human being in a field of view of the video camera; if no human being is detected in the field of view based upon the far-infrared sensory data, recording video with the video camera; if a human being is detected in the field of view based upon the far-infrared sensory data, delaying recording of the video until confirmation is received that the detected human being is willing to be recorded; parsing the video to recognize a human being; suspending recording of the video on determining that a recognized human being is not confirmed as willing to be recorded; parsing the video to recognize a gesture of a human being indicating unwillingness to be recorded; and suspending recording of the video on recognizing the gesture.
- the above method may additionally or alternatively comprise storing one or more images of human beings confirmed as willing to be recorded.
- the evidence of the human being may additionally or alternatively include a far-infrared image corresponding to a head and body shape of a human being.
- the gesture indicating unwillingness may additionally or alternatively include a hand over a face of the human being.
- Some implementations of the above method may additionally or alternatively comprise parsing the far-infrared sensory data to recognize a gesture of one or more human beings indicating willingness to be recorded, and resuming recording of the video on recognizing the gesture indicating willingness to be recorded.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- Human Computer Interaction (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Studio Devices (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
A method to record video with a video camera while respecting bystander privacy includes acquiring sensory data separate from the video, parsing the sensory data for evidence of a human being in a field of view of the video camera, and recording video with the video camera if no human being is detected in the field of view, based upon the sensory data.
Description
VIDEO CAPTURE WITH PRIVACY SAFEGUARD
BACKGROUND
[0001] Video-recording technology is increasingly ubiquitous in the world today. Portable electronic devices such as cellular telephones, tablet computers, near-eye displays, and handheld game systems, for example, may include cameras and associated software to enable video capture.
SUMMARY
[0002] In one example, a method to record video with a video camera while respecting bystander privacy is provided. The method includes acquiring sensory data separate from the video, parsing the sensory data for evidence of a human being in a field of view of the video camera, and recording video with the video camera if no human being is detected in the field of view based upon the sensory data.
[0003] This Summary is provided to introduce a selection of concepts in simplified form that are further described below in the detailed description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to limit the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any disadvantage noted in any part of this disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
[0004] FIGS. 1 and 2 show aspects of an example imaging system in accordance with an embodiment of this disclosure.
[0005] FIGS. 3 and 4 are example thermal maps of a bystander imaged with a far-infrared (FIR) sensor; in FIG. 4, the bystander is making a hand gesture.
[0006] FIG. 5 illustrates an example method to record video with a video camera while respecting bystander privacy, in accordance with an embodiment of this disclosure.
[0007] FIG. 6 is a plot representing a fraction of background pixels of an FIR image erroneously detected as belonging to a face, relative to a fraction of faces detected, in accordance with an embodiment of this disclosure.
DETAILED DESCRIPTION
[0008] In cases where a portable electronic device is wearable, video may be captured automatically and/or continuously. This feature enables the device wearer to accumulate a video record of his or her daily activities and later review the record for subjects of particular interest. Though continuous or automatic video recording on a portable device
may provide benefit for the device operator, bystanders may not wish to be included in images recorded by the device. Accordingly, examples are disclosed herein that may address such issues.
[0009] FIG. 1 shows aspects of an example imaging system 10 in one non-limiting example. The imaging system of FIG. 1 takes the form of a wearable, near-eye display system with continuous video-capture capability; it includes a controller 12 operatively coupled to right and left display elements 14. The controller sends appropriate drive signals to each display element to control the virtual display imagery formed therein. In one example, each display element includes a light-emitting diode (LED) backlight positioned behind a transmissive liquid-crystal display (LCD) array. Other display-element examples may include a reflective LCD array such as a liquid-crystal-on-silicon (LCOS) array. In still other examples, an active-matrix LED array or scanning laser beam may be used to provide the virtual display imagery. In the embodiment of FIG. 1, the right and left display elements are each optically coupled to a corresponding display window 15. Each display window may be configured with beam-turning and/or pupil-expanding functionality, so that the virtual display images formed by display elements 14 are presented to the wearer's eyes.
[0010] Display windows 15 may be at least partially transparent. This feature allows the virtual imagery from display elements 14 to be combined with real imagery sighted through the display windows, to provide an 'augmented reality' (AR) experience for the wearer of imaging system 10. The wearer, herein, is more generally referred to as an 'operator' or 'user' of the imaging system.
[0011] Continuing in FIG. 1, video camera 16 may be configured to record any or all of the real imagery 18 sighted by the operator through display windows 15. The video camera includes an objective lens system 20 that collects light over a field of view (FOV) 22 and directs such light onto an imaging array 24. The imaging array of the video camera may be a high-speed, high-resolution red / green / blue (RGB) complementary metal oxide semiconductor (CMOS) array, in one example. In FIG. 1, the imaging array is operatively coupled to controller 12, which receives image data from the array. Positioned between the objective lens system and the imaging-array aperture is an electronically closable shutter 26. The shutter is configured to close in response to a closure signal from controller 12, thereby preventing video capture under specified conditions, and also providing a visual cue to bystanders that video capture is disabled.
[0012] Imaging system 10 may be configured to support various input modalities in order to receive operator input. For example, pushbuttons arranged on the frames of the imaging system may support manual input. Also, a microphone and associated speech-recognition logic in controller 12 may support voice recognition. Alternatively, or in addition, the imaging system may be configured to track the gaze direction of the operator, and to apply the gaze direction as a form of operator input. To this end, imaging system 10 of FIG. 1 includes right and left eye-imaging cameras 28. The eye-imaging cameras image the operator's eyes to resolve such features as the pupil centers, pupil outlines, or corneal glints created by off-axis illumination of the eyes. The positions of such features in the right and left eye images are provided as input parameters to a model, executed in controller 12, that computes gaze direction. Once the gaze direction is computed, it may be used as position data for interacting with a graphical user interface projected into a user's field of view and/or for receiving eye gesture inputs, for example. Further, in some examples, image data from the eye-imaging cameras may be used to assess eyelid opening and closure— e.g., to detect winking and blinking, which also may serve as forms of operator input.
[0013] As noted above, video camera 16 of imaging system 10 may be configured to automatically record the real imagery sighted by the operator of the imaging system and located within the FOV of the video camera. This scenario is shown also in FIG. 2, where bystander 32 is present in FOV 22, and is sighted by operator 30. This disclosure is directed, in part, to safeguarding the privacy of the bystander, who may not want to be recorded.
[0014] With a traditional hand-held video camera, the mere act of holding the video camera and pointing it toward a subject broadcasts the operator's intent to capture video. A bystander, aware that the recording is taking place but unwilling to be recorded, may avoid the camera or at least signal unwillingness to the operator. However, when a video camera is not held in the operator's hand, but integrated in eyewear, clothing, or otherwise worn, a bystander may have no knowledge that he or she is being recorded and no opportunity to opt out of the recording. In addition, a bystander, discovering that he or she is a subject of on-going recording activity, may feel that his or her privacy has been violated.
[0015] To address this issue, imaging system 10 may be configured to record video only when it is determined that no bystanders are within FOV 22, except those who have confirmed their willingness to be recorded. Accordingly, controller 12 of FIG. 1 includes a
face-recognition engine 34 configured to process the video stream acquired by video camera 16. The face-recognition engine may have access to one or more stored facial images (or other identifying information) of persons confirmed as willing to be recorded. If a bystander is encountered who is not confirmed as willing to be recorded, then video recording may be suspended until that person's willingness can be confirmed (vide infra).
[0016] Triggers of various kinds may be used to initiate video recording on startup of imaging system 10, or to resume video recording after it has been suspended, in a manner respectful of bystander privacy. For example, in some scenarios, it may be left to the operator to determine whether an unwilling bystander is present in FOV 22, and to initiate / resume recording when no such bystander is present. In other scenarios, a few frames of video may be captured provisionally and analyzed in face-recognition engine 34. If the face-recognition engine determines that the FOV includes no bystanders except those whose willingness has been confirmed, then continuous video capture may be enabled.
[0017] In other examples, dedicated hardware of imaging system 10 can be used to initiate / resume video recording without requiring explicit operator input or collection of even one frame of video. Continuing with FIG. 1, imaging system 10 may include a sensor 36, which is separate from video camera 16 but configured to acquire sensory data at least over the FOV 22 of the video camera. In the example shown in FIG. 1, sensor 36 has an FOV 38, which overlaps FOV 22. Controller 12 may be configured to parse sensory data from the sensor for evidence of a human being in FOV 22 and to enable recording of video with the video camera if no human being is detected in the FOV, based upon the sensory data.
[0018] The nature of sensor 36 may differ in various implementations of this disclosure. In one example, sensor 36 is a far-infrared (FIR) sensor— i.e., a non-contact temperature sensor. The FIR sensor may be responsive over a wavelength range of 1 to 10 micrometers, in some examples. Both imaging and non-imaging FIR sensors may be useful for detecting the presence of human beings. In one very basic example, a non-imaging (e.g., single-pixel) FIR sensor may be used to determine whether any object in the video camera's FOV is above a threshold temperature— e.g., > 30 °C at the surface of the object sighted by the sensor. Controller 12 may be configured to initiate or resume video recording only if no such object is present in the FOV. In some examples, high-pass filtering of the sensory signal may be used to distinguish a moving human being from a warm, stationary object, such as a lamp.
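The single-pixel logic above can be sketched as follows. This is a minimal illustration, not an implementation from the disclosure: the 30 °C threshold comes from the paragraph above, while the window length and the variation threshold used as a crude high-pass filter are assumed values chosen for the example.

```python
from collections import deque

HUMAN_TEMP_THRESHOLD_C = 30.0  # surface-temperature threshold from the description
VARIATION_THRESHOLD_C = 0.5    # assumed: minimum temporal spread of a "moving" source

def may_record(temps_c, window=8):
    """Decide whether recording may proceed, given a stream of single-pixel
    FIR temperature readings in degrees C, most recent last.

    A warm reading blocks recording only if it also varies over time, which
    serves to distinguish a moving human being from a warm, stationary object.
    """
    recent = deque(temps_c[-window:], maxlen=window)
    if not recent:
        return True
    if max(recent) < HUMAN_TEMP_THRESHOLD_C:
        return True  # nothing human-warm in the FOV
    # crude high-pass filter: temporal spread within the window
    variation = max(recent) - min(recent)
    return variation < VARIATION_THRESHOLD_C  # warm but static, e.g. a lamp
```

A cool scene or a warm-but-steady object (such as a lamp) permits recording; a warm, fluctuating signal, consistent with a person moving through the FOV, blocks it.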
[0019] In other examples, an imaging FIR sensor 36 may be used to detect human beings based on thermal mapping. Accordingly, sensor 36 may be a MEMS-based thermopile-array sensor, for example. The sensor array may have a resolution and 'color' depth significantly lower than that of video camera 16. Such a sensor may output a relatively low-resolution thermal image of the FOV of the video camera, as shown in FIG. 3. Controller 12 may be configured to analyze the thermal image to detect one or more human-like shapes and to initiate or resume video recording only if no human-like shape is present in the FOV. To this end, the controller may include a shape-recognition engine 40 (referring again to FIG. 1). Higher-resolution FIR image data, even if readily available, may not be desirable for at least two reasons. First, the compute power required to analyze an image increases as the square of the resolution, so using a lower resolution may help to conserve system resources. Second, it is possible that sufficiently high-resolution FIR image data may allow an unwilling bystander to be identified.
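One simple way to find human-like shapes in a low-resolution thermal map is to group above-threshold pixels into connected blobs and treat any sufficiently large blob as a candidate head-and-body shape. The sketch below is a hypothetical stand-in for shape-recognition engine 40; the 30 °C threshold echoes the earlier paragraph, while the blob-size criterion is an assumption for illustration.

```python
def find_warm_blobs(thermal, threshold=30.0):
    """Label 4-connected regions of above-threshold pixels in a low-resolution
    thermal map (a list of rows of temperatures in degrees C).
    Returns a list of blob sizes in pixels.
    """
    rows, cols = len(thermal), len(thermal[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if thermal[r][c] >= threshold and not seen[r][c]:
                stack, size = [(r, c)], 0  # iterative flood fill
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and thermal[ny][nx] >= threshold and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                sizes.append(size)
    return sizes

def human_shape_present(thermal, min_pixels=3):
    """Assumed proxy for the shape-recognition engine: any blob of at least
    min_pixels warm pixels is treated as a candidate human-like shape."""
    return any(s >= min_pixels for s in find_warm_blobs(thermal))
```

A real engine would additionally test blob proportions against head-and-body templates; connected-component labeling is only the first stage.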
[0020] Continuing with FIG. 1, sensor 36 in the form of an imaging FIR array may also be used to support gesture detection. In one example scenario, a bystander, aware that video may be recorded in her proximity, may use a gesture to signal either willingness or unwillingness to be recorded. Examples include a 'thumbs-up' gesture to signal willingness, or, as illustrated in FIG. 4, a 'Let me be' gesture to signal unwillingness. Shape-recognition engine 40 may be configured to detect such gestures in a low-resolution thermal map or in other sensor data.
[0021] While this disclosure describes FIR-based sensory data for indicating the presence of a human being and, optionally, for detecting gestures, it will be understood that other types of sensory data may be used instead of or in addition to FIR-based sensory data. Virtually any form of sensory data may be utilized as long as the data allows a bystander to be recognized as a human being but does not enable the bystander to be personally identified. As such, the sensory data may be chosen to provide below-threshold fidelity in imaging the subject and/or the environment in which the subject is located. Additional sensory modes adaptable for this purpose may include low-resolution visible or near-infrared imaging, low-resolution time-of-flight depth imaging, and ultrasonic and millimeter-wave imaging, among others.
[0022] Although the foregoing drawings and description feature an imaging system in the form of a near-eye display system worn on the face of an operator, the solutions disclosed herein are equally applicable to video capture by devices worn around the neck or concealed in clothing or accessories (e.g., a hat), by cellular telephones, tablet computers,
handheld game systems, and other portable electronic devices. It is also envisaged that certain stationary video-capture systems may be adapted, as presently disclosed, to safeguard the privacy of bystanders. Machine vision in a gaming environment, for example, may be initiated only after it is determined that no unwilling bystanders (e.g., non-players) are present in the system's FOV. Machine vision may additionally be paused when an unrecognized bystander wanders into the FOV of the machine-vision system.
[0023] The configurations described above may enable various methods for video- recording to be enacted in an imaging system. Some such methods are now described with continued reference to the above configurations. It will be understood, however, that the methods here described, and others within the scope of this disclosure, also may be enabled by different configurations.
[0024] FIG. 5 illustrates an example method 42 to record video with a video camera while respecting bystander privacy. At 44, sensory data separate from the video is acquired. As noted above, the sensory data may include imaging or non-imaging FIR data, or virtually any type of data that enables a human being to be detected without being identified.
[0025] At 46, the sensory data is parsed for evidence of a bystander— i.e., a human being in the FOV of the video camera. The evidence of the human being may include a warm locus in the FOV, a warm moving locus, or, if a thermal image is available, a shape of above-threshold temperature corresponding to a head and body of a human being, as examples.
[0026] FIG. 6 shows results of a simple experiment estimating the fraction of pixels from FIR footage collected over many thousands of frames in a warm outdoor temperature (24 °C), a cold outdoor temperature (11 °C), a cold indoor garage temperature (14 °C), an indoor office temperature (21 °C), and an indoor lobby temperature (19 °C) with people constantly coming in from cold outside settings. A single temperature threshold is chosen for these settings. The graph compares the fraction of faces for which at least one pixel is above that threshold (on the x axis) to the fraction of pixels falsely designated as faces (the y axis). At a threshold of 85 °F, for instance, approximately 89% of faces are recognized while approximately 3% of non-face pixels are detected.
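The tradeoff plotted in FIG. 6 can be computed from labeled pixel data as sketched below. This is only an illustration of how such a curve might be derived; the sample temperatures are invented, and the patent's reported figures (89% of faces at a 3% pixel false-positive rate) come from its own experiment, not from this code.

```python
def threshold_tradeoff(face_pixels, background_pixels, thresholds):
    """For each candidate temperature threshold, compute the fraction of faces
    having at least one above-threshold pixel (detection rate, x axis of
    FIG. 6) and the fraction of background pixels above threshold
    (false-positive rate, y axis).

    face_pixels: list of per-face lists of pixel temperatures
    background_pixels: flat list of background pixel temperatures
    """
    points = []
    for t in thresholds:
        detected = sum(1 for face in face_pixels if any(p >= t for p in face))
        false_pos = sum(1 for p in background_pixels if p >= t)
        points.append((t,
                       detected / len(face_pixels),
                       false_pos / len(background_pixels)))
    return points
```

Sweeping `thresholds` over a range and plotting the resulting (detection, false-positive) pairs reproduces the style of curve shown in FIG. 6.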
[0027] It will be understood that these data represent early results, and that further discrimination based on empirical temperature patterns for the face (as opposed to single-threshold rejection) may help to further suppress false-positive detection. In the experiment described above, most of the 11% occurrence of failed face recognition occurred when the face was directed away from the video camera. When the face is directed towards the camera, higher detection rates are observed, missing perhaps 0.1% of face-to-face interactions, based on current analysis.
[0028] Another way to reduce the occurrence of false positives in human-shape detection may be to filter the selected foreground pixels (or a single pixel in a non-imaging sensor configuration) based on whether such pixels exhibit a temporal variation consistent with an underlying physiological process— e.g., breathing or heartbeat. To this end, imaging sensory data may be subject to principal component analysis.
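The physiological filter above can be sketched as a band-energy test: keep a pixel only if most of its temporal variation falls in a frequency band covering typical breathing and heart rates. This is a simplified stand-in for the principal component analysis the disclosure mentions, and the band limits and energy fraction are assumed values.

```python
import math

def band_energy_fraction(series, fps, low_hz=0.2, high_hz=3.0):
    """Fraction of a pixel's temporal signal energy (DC removed) falling in a
    band that spans typical breathing (~0.2-0.5 Hz) and heart rates (~1-3 Hz),
    estimated with a direct DFT over the sample window."""
    n = len(series)
    mean = sum(series) / n
    x = [v - mean for v in series]
    total = band = 0.0
    for k in range(1, n // 2 + 1):
        re = sum(x[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
        im = sum(x[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
        power = re * re + im * im
        total += power
        if low_hz <= k * fps / n <= high_hz:
            band += power
    return band / total if total > 0 else 0.0

def looks_physiological(series, fps, min_fraction=0.5):
    """Assumed filter: retain a foreground pixel only if most of its temporal
    variation lies in the breathing/heart-rate band."""
    return band_energy_fraction(series, fps) >= min_fraction
```

A pixel whose temperature oscillates at roughly 1 Hz passes the filter; a constant-temperature pixel (a lamp, a radiator) does not.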
[0029] Returning now to FIG. 5, at 48 it is determined whether evidence (e.g., above a threshold amount) is found to indicate a bystander's presence. If so, the method advances to 50, where it is determined whether confirmation of the bystander's willingness to be recorded has been received. The bystander's willingness may be confirmed in any suitable manner. For instance, the bystander may signal or otherwise indicate to the operator of the imaging system that he or she is willing to be recorded. The operator, then, may provide touch input, vocal input, gaze-direction input, etc., to the imaging system, which indicates that the bystander is willing to be recorded. In other examples, the bystander's signal may be transmitted electronically, wirelessly, or in the form of a light pulse received by sensor 36 of imaging system 10. In still other examples, confirmation of the bystander's willingness may come in the form of a hand or body gesture— e.g., a thumbs-up gesture. This kind of gesture, detected via a thermal map or other low-resolution image data, may serve to confirm the bystander's willingness to be recorded.
[0030] In some embodiments, the act of confirming whether the bystander's willingness has been received may be non-invasive (or even unknown) to the bystander. In other embodiments, the bystander may be directly queried. To prevent repeated querying of the same unwilling bystander, a feature vector may be assembled from the sensory data for each bystander encountered, and stored in a database if the bystander is confirmed unwilling to be recorded. Direct querying of the bystander may then be omitted if the bystander's feature vector matches that of a stored, unwilling bystander. It will be noted that assembled feature vectors may be of sufficient fidelity to enable a positive match of a previously observed bystander, but of insufficient fidelity to enable the bystander to be identified.
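The feature-vector idea above can be illustrated with a deliberately coarse descriptor: average the bystander's thermal patch over a tiny grid. The grid size and matching tolerance below are assumptions; the point is that the vector is of sufficient fidelity to re-match a recently seen bystander, yet far too coarse to identify anyone.

```python
def feature_vector(thermal_patch, grid=2):
    """Reduce a bystander's thermal patch (rows of degrees C) to a low-fidelity
    descriptor: the mean temperature of each cell in a grid x grid tiling."""
    rows, cols = len(thermal_patch), len(thermal_patch[0])
    vec = []
    for gy in range(grid):
        for gx in range(grid):
            cells = [thermal_patch[y][x]
                     for y in range(gy * rows // grid, (gy + 1) * rows // grid)
                     for x in range(gx * cols // grid, (gx + 1) * cols // grid)]
            vec.append(sum(cells) / len(cells))
    return vec

def matches_stored(vec, stored_vectors, tolerance=1.0):
    """True if vec is within tolerance (per component) of any stored vector of
    a bystander previously confirmed unwilling, suppressing a repeat query."""
    return any(all(abs(a - b) <= tolerance for a, b in zip(vec, stored))
               for stored in stored_vectors)
```

On each new detection, the system would compute `feature_vector` and call `matches_stored` against the database of confirmed-unwilling bystanders before issuing any direct query.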
[0031] If confirmation of the bystander's willingness is received, then, at 52, recording of the video is initiated. Conversely, in scenarios where the sensory data provides
sufficient evidence of a human being in the FOV of the video camera, recording of the video is delayed until confirmation is received that the human being is willing to be recorded. Recording of video with the video camera may be initiated automatically if no human being is detected in the video camera's FOV, based on the sensory data.
[0032] At 54 of method 42, the video is parsed in order to recognize one or more human beings. Then, at 56, it is determined whether the parsed video includes sufficient evidence (e.g., a face) of a bystander who is not already confirmed as willing to be recorded. This act may require the storing of one or more images of persons confirmed as willing to be recorded, for comparison against the real-time video. If the video contains sufficient evidence of a bystander not confirmed as willing to be recorded, then, at 58, the recording of the video is suspended. Otherwise, the method advances to 60 and 61, where the video and, optionally, the sensory data are parsed for a hand or body gesture indicating unwillingness of the bystander to be recorded. A gesture of this kind may also be used by a bystander to opt out of video recording, even if willingness to be recorded was previously signaled. The gesture may include a hand over a face of the bystander (as shown in FIG. 2), an alternative 'Let me be' gesture (as shown in FIG. 4), or virtually any other gesture identifiable in the video and/or sensory data. If, at 62, a gesture indicating unwillingness is recognized, then the method advances to 58, where video recording is suspended.
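The decision logic of method 42 can be condensed into a single pure function over four observations. This is a schematic of the flow of FIG. 5 only; the boolean inputs stand in for the sensor parsing, face recognition, and gesture recognition described above.

```python
def recording_decision(human_detected, willing_confirmed,
                       unconfirmed_face_in_video, optout_gesture_seen):
    """One pass through the decision logic of method 42 (FIG. 5), reduced to a
    pure function. Returns 'record', 'delay', or 'suspend'. Step numbers in
    the comments refer to FIG. 5.
    """
    if human_detected and not willing_confirmed:
        return 'delay'    # 48 -> 50: wait for confirmation of willingness
    if unconfirmed_face_in_video:
        return 'suspend'  # 56 -> 58: unconfirmed bystander in the video
    if optout_gesture_seen:
        return 'suspend'  # 62 -> 58: bystander opted out by gesture
    return 'record'       # 52: no privacy objection detected
```

Note the precedence: detection without confirmation delays recording before any video is parsed, matching the order of acts in the method.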
[0033] In some examples, video recording may be suspended in software— e.g., by not acquiring image frames via the video camera. In other examples, a more positive act may be taken to suspend recording of the video. For example, the appropriate gate bias may be removed from the imaging array of the video camera, which prevents any image from being formed. In other examples, a physical shutter arranged over the camera aperture may be closed to prevent exposure of the array to real imagery. An advantage of the latter approach is that it broadcasts to the wary bystander that video capture has been disabled. In some examples, the act of suspending video recording may be accompanied by flushing one or more of the most recent frames of already-acquired video from the memory of controller 12, to further protect the bystander's privacy.
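The frame-flushing step mentioned above can be sketched as a rolling buffer whose newest frames are discarded on suspension. The class and its capacity are hypothetical; the disclosure specifies only that recently acquired frames may be flushed from controller memory.

```python
from collections import deque

class FrameBuffer:
    """Minimal sketch of a rolling video buffer whose most recent frames can
    be flushed when recording is suspended, so that footage captured just
    before a bystander was noticed does not persist."""

    def __init__(self, capacity=300):
        self.frames = deque(maxlen=capacity)

    def push(self, frame):
        self.frames.append(frame)

    def flush_recent(self, n):
        """Discard the n newest frames (e.g., on suspending recording)."""
        for _ in range(min(n, len(self.frames))):
            self.frames.pop()
```

Flushing from the newest end preserves older, already-cleared footage while removing exactly the frames most likely to contain the newly detected bystander.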
[0034] After video recording is suspended, execution of method 42 returns to 44, where additional sensory data is acquired. Next, it is again determined whether a threshold amount of sensory evidence of a human being exists in the FOV, and if so, whether that human being is a bystander who may be willing to be recorded. Recording of the video may be resumed when it is confirmed that the bystander is willing to be recorded.
[0035] As evident from the foregoing description, the methods and processes described herein may be tied to a compute system of one or more computing machines— e.g., controller 12 of FIG. 1. Such methods and processes may be implemented as a hardware driver program or service, an application-programming interface (API), a library, and/or other computer-program product. Each computing machine includes a logic machine 64, an associated computer-memory machine 66, and a communication machine 68.
[0036] Each logic machine includes one or more physical logic devices configured to execute instructions. A logic machine may be configured to execute instructions that are part of one or more applications, services, programs, routines, libraries, objects, components, data structures, or other logical constructs. Such instructions may be implemented to perform a task, implement a data type, transform the state of one or more components, achieve a technical effect, or otherwise arrive at a desired result.
[0037] A logic machine may include one or more processors configured to execute software instructions. Additionally or alternatively, a logic machine may include one or more hardware or firmware logic machines configured to execute hardware or firmware instructions. Processors of a logic machine may be single-core or multi-core, and the instructions executed thereon may be configured for sequential, parallel, and/or distributed processing. Individual components of a logic machine optionally may be distributed among two or more separate devices, which may be remotely located and/or configured for coordinated processing. Aspects of a logic machine may be virtualized and executed by remotely accessible, networked computing devices configured in a cloud-computing configuration.
[0038] Each computer-memory machine includes one or more physical computer-memory devices configured to hold instructions executable by an associated logic machine to implement the methods and processes described herein. When such methods and processes are implemented, the state of the computer-memory machine may be transformed— e.g., to hold different data. A computer-memory machine may include removable and/or built-in devices; it may include optical memory (e.g., CD, DVD, HD-DVD, Blu-Ray Disc, etc.), semiconductor memory (e.g., RAM, EPROM, EEPROM, etc.), and/or magnetic memory (e.g., hard-disk drive, floppy-disk drive, tape drive, MRAM, etc.), among others. A computer-memory machine may include volatile, nonvolatile, dynamic, static, read/write, read-only, random-access, sequential-access, location-addressable, file-addressable, and/or content-addressable devices.
[0039] It will be appreciated that a computer-memory machine includes one or more physical devices. However, aspects of the instructions described herein alternatively may be propagated by a communication medium (e.g., an electromagnetic signal, an optical signal, etc.), as opposed to being stored via a storage medium.
[0040] Aspects of a logic machine and associated computer-memory machine may be integrated together into one or more hardware-logic components. Such hardware-logic components may include field-programmable gate arrays (FPGAs), program- and application-specific integrated circuits (PASIC / ASICs), program- and application-specific standard products (PSSP / ASSPs), system-on-a-chip (SOC), and complex programmable logic devices (CPLDs), for example.
[0041] The terms 'program' and 'engine' may be used to describe an aspect of a computer system implemented to perform a particular function. In some cases, a program or engine may be instantiated via a logic machine executing instructions held by a computer-memory machine. It will be understood that different programs and engines may be instantiated from the same application, service, code block, object, library, routine, API, function, etc. Likewise, the same program or engine may be instantiated by different applications, services, code blocks, objects, routines, APIs, functions, etc. A program or engine may encompass individual or groups of executable files, data files, libraries, drivers, scripts, database records, etc.
[0042] A communication machine may be configured to communicatively couple the compute system to one or more other machines, including server computer systems. The communication machine may include wired and/or wireless communication devices compatible with one or more different communication protocols. As non-limiting examples, a communication machine may be configured for communication via a wireless telephone network, or a wired or wireless local- or wide-area network. In some examples, a communication machine may allow a computing machine to send and/or receive messages to and/or from other devices via a network such as the Internet.
[0043] It will be understood that the configurations and/or approaches described herein are exemplary in nature, and that these specific examples are not to be considered in a limiting sense, because numerous variations are possible. The specific routines or methods described herein may represent one or more of any number of processing strategies. As such, various acts illustrated and/or described may be performed in the sequence illustrated and/or described, in other sequences, in parallel, or omitted. Likewise, the order of the above-described processes may be changed.
[0044] Another example provides a method to record video with a video camera while respecting bystander privacy. The method comprises acquiring thermal sensory data separate from the video, parsing the thermal sensory data for evidence of a human being in a field of view of the video camera, and recording video with the video camera if no human being is detected in the field of view based on the thermal sensory data.
[0045] In some implementations, the above method may additionally or alternatively comprise, if a human being is detected in the field of view based upon the thermal sensory data, delaying recording of the video until confirmation is received that the human being is willing to be recorded. In some implementations, the above method may additionally or alternatively comprise parsing the video to recognize one or more human beings. In some implementations, the above method may additionally or alternatively comprise suspending recording of the video on recognizing a human being not confirmed as willing to be recorded. In some implementations, the above method may additionally or alternatively comprise resuming recording of the video when it is confirmed that the human being is willing to be recorded. In some implementations, the above method may additionally or alternatively comprise parsing the video to recognize a hand or body gesture of a human being. In some implementations, the above method may additionally or alternatively comprise suspending recording of the video on recognizing a hand or body gesture indicating unwillingness to be recorded. In some implementations, the above method may additionally or alternatively comprise parsing the thermal sensory data to recognize a hand or body gesture of a human being indicating willingness to be recorded, and resuming recording of the video on recognizing the hand or body gesture.
[0046] Another example provides an imaging system comprising a video camera; separate from the video camera, a sensor configured to acquire sensory data over a field of view of the video camera; and a controller configured to parse the sensory data for evidence of a human being in the field of view of the video camera and to enable recording of video with the video camera if no human being is detected in the field of view based upon the sensory data.
[0047] In some implementations, the sensor of the above imaging system may additionally or alternatively comprise a far-infrared sensor. In some implementations, the sensor may additionally or alternatively comprise a thermopile array sensor. In some implementations, the sensor may additionally or alternatively comprise an imaging sensor of lower resolution and/or color depth than the video camera. Some implementations of the above imaging system may additionally or alternatively comprise an electronically
closable shutter arranged over an aperture of the video camera, wherein the controller is configured to keep the shutter closed when recording of the video is not enabled. In some implementations, the above imaging system is wearable and/or configured for continuous video acquisition.
[0048] Another aspect of this disclosure is directed to another method to record video with a video camera while respecting bystander privacy. This method comprises acts of: acquiring far-infrared sensory data separate from the video; parsing the far-infrared sensory data for evidence of a human being in a field of view of the video camera; if no human being is detected in the field of view based upon the far-infrared sensory data, recording video with the video camera; if a human being is detected in the field of view based upon the far-infrared sensory data, delaying recording of the video until confirmation is received that the detected human being is willing to be recorded; parsing the video to recognize a human being; suspending recording of the video on determining that a recognized human being is not confirmed as willing to be recorded; parsing the video to recognize a gesture of a human being indicating unwillingness to be recorded; and suspending recording of the video on recognizing the gesture.
[0049] In some implementations, the above method may additionally or alternatively comprise storing one or more images of human beings confirmed as willing to be recorded. In some implementations, the evidence of the human being may additionally or alternatively include a far-infrared image corresponding to a head and body shape of a human being. In some implementations, the gesture indicating unwillingness may additionally or alternatively include a hand over a face of the human being. Some implementations of the above method may additionally or alternatively comprise parsing the far-infrared sensory data to recognize a gesture of one or more human beings indicating willingness to be recorded, and resuming recording of the video on recognizing the gesture indicating willingness to be recorded.
[0050] The subject matter of the present disclosure includes all novel and non-obvious combinations and sub-combinations of the various processes, systems and configurations, and other features, functions, acts, and/or properties disclosed herein, as well as any and all equivalents thereof.
Claims
1. A method to record video with a video camera while respecting bystander privacy, the method comprising:
acquiring from an infrared light sensor an input of thermal sensory data separate from the video;
parsing the thermal sensory data for evidence of a human being in a field of view of the video camera; and
recording video with the video camera if no human being is detected in the field of view based on the thermal sensory data.
2. The method of claim 1, further comprising, if a human being is detected in the field of view based upon the thermal sensory data, delaying recording of the video until confirmation is received that the human being is willing to be recorded.
3. The method of claim 1, further comprising parsing the video to recognize one or more human beings.
4. The method of claim 3, further comprising suspending recording of the video on recognizing a human being not confirmed as willing to be recorded.
5. The method of claim 4, further comprising resuming recording of the video when it is confirmed that the human being is willing to be recorded.
6. The method of claim 1, further comprising parsing the video to recognize a hand or body gesture of a human being.
7. The method of claim 6, further comprising suspending recording of the video on recognizing a hand or body gesture indicating unwillingness to be recorded.
8. The method of claim 5, further comprising parsing the thermal sensory data to recognize a hand or body gesture of a human being indicating willingness to be recorded, and resuming recording of the video on recognizing the hand or body gesture.
9. An imaging system comprising:
a video camera;
separate from the video camera, a sensor configured to acquire sensory data over a field of view of the video camera; and
a controller configured to parse the sensory data for evidence of a human being in the field of view of the video camera and to enable recording of video with the video camera if no human being is detected in the field of view based upon the sensory data.
10. The imaging system of claim 9, wherein the sensor is a far-infrared sensor.
11. The imaging system of claim 9, wherein the sensor is a thermopile array sensor.
12. The imaging system of claim 9, wherein the sensor is an imaging sensor of lower resolution and/or color depth than the video camera.
13. The imaging system of claim 9, further comprising an electronically closable shutter arranged over an aperture of the video camera, wherein the controller is configured to keep the shutter closed when recording of the video is not enabled.
14. The imaging system of claim 9, wherein the imaging system is wearable.
15. The imaging system of claim 9, wherein the imaging system is configured for continuous video acquisition.
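Claims 9-13 describe a system in which a separate low-resolution thermal sensor (e.g. a thermopile array) gates both recording and an electronically closable shutter. The following is a hedged sketch of one plausible reading of that structure; the temperature band, blob-size threshold, and all names (`human_detected`, `Controller`, `on_thermal_frame`) are assumptions introduced for illustration, not taken from the claims.

```python
# Assumed skin-temperature band and minimum warm-pixel count for
# treating a thermal blob as evidence of a human being.
HUMAN_TEMP_C = (30.0, 40.0)
MIN_WARM_PIXELS = 3

def human_detected(thermal_frame):
    """Return True if the thermal frame shows a plausibly human blob.

    thermal_frame is a low-resolution grid of temperatures in Celsius,
    far coarser than the video camera's own output (cf. claim 12).
    """
    lo, hi = HUMAN_TEMP_C
    warm = sum(1 for row in thermal_frame for t in row if lo <= t <= hi)
    return warm >= MIN_WARM_PIXELS

class Controller:
    """Enables recording only when no human is detected, and keeps the
    shutter closed whenever recording is not enabled (cf. claim 13)."""

    def __init__(self):
        self.recording_enabled = False
        self.shutter_open = False

    def on_thermal_frame(self, thermal_frame):
        self.recording_enabled = not human_detected(thermal_frame)
        self.shutter_open = self.recording_enabled
        return self.recording_enabled
```

Tying the shutter to the same flag as recording gives a physical privacy guarantee: while a person is in view, the camera aperture is closed, not merely ignored in software.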
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP15770702.7A EP3192006A1 (en) | 2014-09-12 | 2015-09-09 | Video capture with privacy safeguard |
CN201580048995.1A CN107077598B (en) | 2014-09-12 | 2015-09-09 | Video capture with privacy protection |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201462049943P | 2014-09-12 | 2014-09-12 | |
US62/049,943 | 2014-09-12 | ||
US14/593,839 | 2015-01-09 | ||
US14/593,839 US10602054B2 (en) | 2014-09-12 | 2015-01-09 | Video capture with privacy safeguard |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016040404A1 true WO2016040404A1 (en) | 2016-03-17 |
Family
ID=55456081
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2015/049055 WO2016040404A1 (en) | 2014-09-12 | 2015-09-09 | Video capture with privacy safeguard |
Country Status (4)
Country | Link |
---|---|
US (1) | US10602054B2 (en) |
EP (1) | EP3192006A1 (en) |
CN (1) | CN107077598B (en) |
WO (1) | WO2016040404A1 (en) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016077798A1 (en) | 2014-11-16 | 2016-05-19 | Eonite Perception Inc. | Systems and methods for augmented reality preparation, processing, and application |
US9916002B2 (en) | 2014-11-16 | 2018-03-13 | Eonite Perception Inc. | Social applications for augmented reality technologies |
US10043319B2 (en) | 2014-11-16 | 2018-08-07 | Eonite Perception Inc. | Optimizing head mounted displays for augmented reality |
US10102419B2 (en) * | 2015-10-30 | 2018-10-16 | Intel Corporation | Progressive radar assisted facial recognition |
US11017712B2 (en) | 2016-08-12 | 2021-05-25 | Intel Corporation | Optimized display image rendering |
US9928660B1 (en) | 2016-09-12 | 2018-03-27 | Intel Corporation | Hybrid rendering for a wearable display attached to a tethered computer |
US11074359B2 (en) | 2017-06-05 | 2021-07-27 | International Business Machines Corporation | Privacy focused network sensor device object recognition |
US10754996B2 (en) | 2017-09-15 | 2020-08-25 | Paypal, Inc. | Providing privacy protection for data capturing devices |
US10679039B2 (en) * | 2018-04-03 | 2020-06-09 | Google Llc | Detecting actions to discourage recognition |
US20190349517A1 (en) * | 2018-05-10 | 2019-11-14 | Hanwha Techwin Co., Ltd. | Video capturing system and network system to support privacy mode |
EP3855349A1 (en) * | 2020-01-27 | 2021-07-28 | Microsoft Technology Licensing, LLC | Extracting information about people from sensor signals |
CN112181152B (en) * | 2020-11-13 | 2023-05-26 | 幻蝎科技(武汉)有限公司 | Advertisement pushing management method, device and application based on MR (magnetic resonance) glasses |
US20220383512A1 (en) * | 2021-05-27 | 2022-12-01 | Varjo Technologies Oy | Tracking method for image generation, a computer program product and a computer system |
US11696011B2 (en) | 2021-10-21 | 2023-07-04 | Raytheon Company | Predictive field-of-view (FOV) and cueing to enforce data capture and transmission compliance in real and near real time video |
US11792499B2 (en) * | 2021-10-21 | 2023-10-17 | Raytheon Company | Time-delay to enforce data capture and transmission compliance in real and near real time video |
WO2023087215A1 (en) * | 2021-11-18 | 2023-05-25 | Citrix Systems, Inc. | Online meeting non-participant detection and remediation |
US11700448B1 (en) | 2022-04-29 | 2023-07-11 | Raytheon Company | Computer/human generation, validation and use of a ground truth map to enforce data capture and transmission compliance in real and near real time video of a local scene |
Family Cites Families (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1418555B1 (en) * | 1999-12-17 | 2007-10-10 | Siemens Schweiz AG | Presence detector and use thereof |
MXPA04010813A (en) * | 2002-04-29 | 2005-03-07 | Thomson Licensing Sa | Method and apparatus for controlling digital recording and associated user interfaces. |
GB2400514B (en) | 2003-04-11 | 2006-07-26 | Hewlett Packard Development Co | Image capture method |
CN101127834B (en) | 2003-05-20 | 2011-04-06 | 松下电器产业株式会社 | Image capturing system |
US20040233282A1 (en) * | 2003-05-22 | 2004-11-25 | Stavely Donald J. | Systems, apparatus, and methods for surveillance of an area |
KR100548418B1 (en) * | 2003-10-31 | 2006-02-02 | 엘지전자 주식회사 | Restriction method and system for photographing in handheld terminal combined with camera |
US7590405B2 (en) | 2005-05-10 | 2009-09-15 | Ewell Jr Robert C | Apparatus for enabling a mobile communicator and methods of using the same |
US20070237358A1 (en) * | 2006-04-11 | 2007-10-11 | Wei-Nan William Tseng | Surveillance system with dynamic recording resolution and object tracking |
IL176369A0 (en) | 2006-06-18 | 2007-06-03 | Photo Free Ltd | A system & method for preventing photography |
US7817914B2 (en) | 2007-05-30 | 2010-10-19 | Eastman Kodak Company | Camera configurable for autonomous operation |
US8750578B2 (en) * | 2008-01-29 | 2014-06-10 | DigitalOptics Corporation Europe Limited | Detecting facial expressions in digital images |
CN101667235B (en) | 2008-09-02 | 2013-10-23 | 北京瑞星信息技术有限公司 | Method and device for protecting user privacy |
GB2485534B (en) | 2010-11-15 | 2016-08-17 | Edesix Ltd | Imaging recording apparatus |
CN102681652B (en) | 2011-03-09 | 2016-03-30 | 联想(北京)有限公司 | A kind of implementation method of safety input and terminal |
US20120277914A1 (en) * | 2011-04-29 | 2012-11-01 | Microsoft Corporation | Autonomous and Semi-Autonomous Modes for Robotic Capture of Images and Videos |
CN103167230B (en) * | 2011-12-17 | 2017-10-03 | 富泰华工业(深圳)有限公司 | Electronic equipment and its method taken pictures according to gesture control |
EP2820837A4 (en) | 2012-03-01 | 2016-03-09 | H4 Eng Inc | Apparatus and method for automatic video recording |
JP6316540B2 (en) * | 2012-04-13 | 2018-04-25 | 三星電子株式会社Samsung Electronics Co.,Ltd. | Camera device and control method thereof |
GB2519769B (en) * | 2013-10-29 | 2016-09-28 | Cp Electronics Ltd | Apparatus for controlling an electrical load |
CN103957103B (en) | 2014-04-17 | 2017-07-04 | 小米科技有限责任公司 | The method of safety verification, device and mobile terminal |
US9179105B1 (en) * | 2014-09-15 | 2015-11-03 | Belkin International, Inc. | Control of video camera with privacy feedback |
- 2015-01-09 US US14/593,839 patent/US10602054B2/en active Active
- 2015-09-09 EP EP15770702.7A patent/EP3192006A1/en not_active Withdrawn
- 2015-09-09 CN CN201580048995.1A patent/CN107077598B/en active Active
- 2015-09-09 WO PCT/US2015/049055 patent/WO2016040404A1/en active Application Filing
Non-Patent Citations (6)
Title |
---|
ANONYMOUS: "Thermographic camera - Wikipedia, the free encyclopedia", 10 July 2014 (2014-07-10), XP055234610, Retrieved from the Internet <URL:https://en.wikipedia.org/w/index.php?title=Thermographic_camera&oldid=616310920> [retrieved on 20151208] * |
ASHWIN ASHOK ET AL: "Do not share!", VISIBLE LIGHT COMMUNICATION SYSTEMS, ACM, 2 PENN PLAZA, SUITE 701 NEW YORK NY 10121-0701 USA, 7 September 2014 (2014-09-07), pages 39 - 44, XP058056082, ISBN: 978-1-4503-3067-1, DOI: 10.1145/2643164.2643168 * |
FRANZISKA ROESNER ET AL: "World-Driven Access Control for Continuous Sensing", COMPUTER AND COMMUNICATIONS SECURITY, 19 May 2014 (2014-05-19), 2 Penn Plaza, Suite 701 New York NY 10121-0701 USA, XP055234606, ISBN: 978-1-4503-2957-6, Retrieved from the Internet <URL:http://research.microsoft.com/pubs/217301/wdac-tr.pdf> [retrieved on 20151208] * |
MUKHTAJ S BARHM ET AL: "Negotiating Privacy Preferences in Video Surveillance Systems", 28 June 2011, MODERN APPROACHES IN APPLIED INTELLIGENCE, SPRINGER BERLIN HEIDELBERG, BERLIN, HEIDELBERG, PAGE(S) 511 - 521, ISBN: 978-3-642-21826-2, XP047024567 * |
SEUNGYEOP HAN ET AL: "GlimpseData", PHYSICAL ANALYTICS, ACM, 2 PENN PLAZA, SUITE 701 NEW YORK NY 10121-0701 USA, 11 June 2014 (2014-06-11), pages 31 - 36, XP058052785, ISBN: 978-1-4503-2825-8, DOI: 10.1145/2611264.2611269 * |
TAMARA DENNING ET AL: "In situ with bystanders of augmented reality glasses", HUMAN FACTORS IN COMPUTING SYSTEMS, ACM, 2 PENN PLAZA, SUITE 701 NEW YORK NY 10121-0701 USA, 26 April 2014 (2014-04-26), pages 2377 - 2386, XP058046882, ISBN: 978-1-4503-2473-1, DOI: 10.1145/2556288.2557352 * |
Also Published As
Publication number | Publication date |
---|---|
US20160080642A1 (en) | 2016-03-17 |
EP3192006A1 (en) | 2017-07-19 |
CN107077598A (en) | 2017-08-18 |
CN107077598B (en) | 2020-10-27 |
US10602054B2 (en) | 2020-03-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10602054B2 (en) | Video capture with privacy safeguard | |
US11281288B2 (en) | Eye and head tracking | |
US10372751B2 (en) | Visual search in real world using optical see-through head mounted display with augmented reality and user interaction tracking | |
US12073647B2 (en) | Detecting device, detecting method, and recording medium | |
KR102561991B1 (en) | Eyewear-mountable eye tracking device | |
EP3117417B1 (en) | Remote device control via gaze detection | |
CN106062776B (en) | The eye tracking of polarization | |
US10278576B2 (en) | Behind-eye monitoring using natural reflection of lenses | |
US10303245B2 (en) | Methods and devices for detecting and responding to changes in eye conditions during presentation of video on electronic devices | |
US20140201844A1 (en) | Detection of and privacy preserving response to observation of display screen | |
US20160182814A1 (en) | Automatic camera adjustment to follow a target | |
US9961307B1 (en) | Eyeglass recorder with multiple scene cameras and saccadic motion detection | |
US20210378509A1 (en) | Pupil assessment using modulated on-axis illumination | |
US11328187B2 (en) | Information processing apparatus and information processing method | |
WO2024021251A1 (en) | Identity verification method and apparatus, and electronic device and storage medium | |
US20230319428A1 (en) | Camera comprising lens array |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 15770702; Country of ref document: EP; Kind code of ref document: A1 |
DPE2 | Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101) | |
REEP | Request for entry into the european phase | Ref document number: 2015770702; Country of ref document: EP |
WWE | Wipo information: entry into national phase | Ref document number: 2015770702; Country of ref document: EP |
NENP | Non-entry into the national phase | Ref country code: DE |