WO2021108913A1 - Système vidéo, procédé d'étalonnage du système vidéo et procédé de capture d'une image à l'aide du système vidéo - Google Patents

Système vidéo, procédé d'étalonnage du système vidéo et procédé de capture d'une image à l'aide du système vidéo

Info

Publication number
WO2021108913A1
Authority
WO
WIPO (PCT)
Prior art keywords
cameras
controller
video system
camera
perforated screen
Prior art date
Application number
PCT/CA2020/051661
Other languages
English (en)
Inventor
Daniel LABONTÉ
Hugo BOUJUT-BURGUN
Original Assignee
Studio Thinkwell Montréal Inc.
Priority date
Filing date
Publication date
Application filed by Studio Thinkwell Montréal Inc.
Publication of WO2021108913A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00Details of colour television systems
    • H04N9/12Picture reproducers
    • H04N9/31Projection devices for colour picture display, e.g. using electronic spatial light modulators [ESLM]
    • H04N9/3141Constructional details thereof
    • H04N9/3147Multi-projection systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00Details of colour television systems
    • H04N9/12Picture reproducers
    • H04N9/31Projection devices for colour picture display, e.g. using electronic spatial light modulators [ESLM]
    • H04N9/3179Video signal processing therefor
    • H04N9/3182Colour adjustment, e.g. white balance, shading or gamut
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00Details of colour television systems
    • H04N9/12Picture reproducers
    • H04N9/31Projection devices for colour picture display, e.g. using electronic spatial light modulators [ESLM]
    • H04N9/3191Testing thereof
    • H04N9/3194Testing thereof including sensor feedback
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10048Infrared image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/166Detection; Localisation; Normalisation using acquisition arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/76Television signal recording
    • H04N5/765Interface circuits between an apparatus for recording and another apparatus
    • H04N5/77Interface circuits between an apparatus for recording and another apparatus between a recording apparatus and a television camera

Definitions

  • VIDEO SYSTEM, METHOD FOR CALIBRATING THE VIDEO SYSTEM AND METHOD FOR CAPTURING AN IMAGE USING THE VIDEO SYSTEM
  • the present disclosure relates to the field of video systems.
  • the present disclosure relates to a video system, a method for calibrating the video system and a method for capturing an image using the video system.
  • Interactive video systems are rapidly gaining interest in various industries such as gaming, virtual reality, video conferencing, advertising, telemedicine, and multimedia in general. Many such applications need to properly capture an image of a person located in front of a screen in order to integrate that image into a real-time video application. Image capture is frequently imperfect, for example when light sources are present in the image field, and is especially impractical for a camera placed behind a projection screen while video images are being projected onto its front.
  • Eye contact problems occur when participants in a videoconference typically orient their gaze towards their computer screen while the images shown to other participants are captured by cameras that are typically positioned above the screens, as in the case of most laptop computers. Consequently, eye-to-eye contact between the participants is virtually impossible. The lack of eye-to-eye contact is detrimental to the videoconferencing experience or to any interactive video system trying to present an interlocutor or an avatar with correct eye-to-eye contact.
  • a similar problem occurs when taking a so-called ‘selfie’ image using a conventional smartphone.
  • the person taking the image is typically looking at the screen and not at the camera hole positioned above the screen (or on a side of the screen, depending on how the smartphone is held). If the image is taken at close range, the gaze of the person in the image appears to not be looking into the displayed image.
  • A first solution consists of making a hole in the screen and capturing the face of the person. This solution only works if the person looks into the hole, thus restricting the flexibility of the system. Also, light sources behind or around the person, such as projectors, can interfere with the camera.
  • A second solution consists of using two or more cameras around the screen to obtain images of the person from several points of view, and then merging the images of the head in order to place the gaze at the right position on the screen (just in front of a tele-participant or avatar).
  • In configurations with multiple projectors, it is difficult to position the cameras around the screen surface in order to get a good view of the person. This task may be impossible if the person is too close to the screen in relation to the dimensions of the screen.
  • Some gaming applications or interactive systems require the individuals to be physically located very close to a projection screen in order for them to acquire a feeling of being immersed in the game or the virtual world. Depending on their position between a projector and the screen, the individuals or objects may cause a shadow to appear on the screen.
  • The present disclosure introduces a video system comprising a perforated screen, a camera assembly, and a controller.
  • the camera assembly comprises a plurality of cameras, the camera assembly being positioned on a rear face of the perforated screen, apertures of each of the plurality of cameras being substantially aligned with a corresponding hole of the perforated screen.
  • the controller is operatively connected to the camera assembly.
  • the controller is configured to receive an image from each of the plurality of cameras of the camera assembly, and identify, based on the images received from each of the plurality of cameras, an object located in front of the perforated screen.
  • The present disclosure also relates to a method for calibrating the video system. White light is projected through the perforated screen toward each camera of the camera assembly. Captures of the white light by the cameras of the camera assembly are compared. A luminance and a white balance of the captures by each of the cameras are calculated. For each camera, an adjustment of the luminance and of the white balance is calculated so that the luminance and the white balance are equalized within the camera assembly.
  • One of the cameras of the camera assembly is selected as a reference camera.
  • a calibration pattern is provided in front of the perforated screen for capture by the cameras of the camera assembly. Captures of the calibration pattern by the cameras are stored in memory.
  • the calibration pattern is moved in front of the perforated screen and the calibration pattern is captured at least a second time by the cameras.
  • the stored captures of the reference camera are compared with the stored captures of the other cameras of the camera assembly to estimate distortion parameters of the cameras of the camera assembly.
  • a geometrical adjustment sufficient to overcome the distortion parameters is calculated for each camera of the camera assembly.
  • a calibration pattern is provided in front of the perforated screen for capture by the cameras of the camera assembly. Captures of the calibration pattern by the cameras are stored in memory. The calibration pattern is moved in front of the perforated screen and the calibration pattern is captured by the cameras at least a second time. After at least two repetitions, the stored captures of the cameras of the camera assembly are compared with the calibration pattern to estimate distortion parameters of the cameras of the camera assembly. A geometrical adjustment sufficient to overcome the distortion parameters is calculated for each camera of the camera assembly.
  • the present disclosure further relates to a method for capturing an image using the video system.
  • An image is acquired by each camera of the camera assembly.
  • the luminance and white balance adjustment calculated for each camera is applied to the respective acquired images.
  • the geometric adjustment calculated for each camera is applied to the respective acquired images.
  • Figure 1 is a close-up, partial front elevation view of a conventional perforated front projection screen
  • Figure 2 is a rear perspective view of a camera assembly comprising a plurality of cameras according to an embodiment
  • Figure 3 is a front elevation view of the camera assembly of Figure 2;
  • Figure 4 is a partial view of an interactive video system comprising the camera assembly of Figure 2 mounted on a rear face of the perforated screen of Figure 1 according to an embodiment
  • Figure 5 is a schematic block diagram of the interactive video system of Figure 4 according to an embodiment
  • Figure 6 is a sequence diagram showing operations of a camera color calibration method according to an embodiment
  • Figure 7 is a sequence diagram showing operations of a geometric camera calibration method according to an embodiment
  • Figure 8 is a sequence diagram showing operations of a method for modifying a composite image with light source removal and/or object removal according to an embodiment;
  • Figure 9 is a sequence diagram showing operations of a dynamic shadow removal method in a multi-projector video system according to an embodiment
  • Figure 10 is a sequence diagram showing operations of a method for extracting a cut out model of an object facing the screen from a scene view according to an embodiment
  • Figure 11 is a sequence diagram showing operations of a method for removing from a video projection a shadow of an object located in front of the screen according to an embodiment
  • Figure 12 is an example of a white light source image captured by a first camera of the interactive video system of Figure 4, the first camera not being calibrated;
  • Figure 13a is an example of a white light source image captured by the first camera of the interactive video system of Figure 4, the first camera being calibrated using instructions from a manufacturer of the first camera;
  • Figure 13b is a representation of an influence of holes in the screen of Figure 1 on a luminance of the image of Figure 13a;
  • Figure 13c is a representation of an influence of holes in the screen of Figure 1 on a white temperature of the image of Figure 13a;
  • Figure 14a is an example of a white light source image captured by the first camera of the interactive video system of Figure 4, the first camera being calibrated using the method of Figures 6 and 7;
  • Figure 14b is a representation of an influence of holes in the screen of Figure 1 on a luminance of the image of Figure 14a;
  • Figure 14c is a representation of an influence of holes in the screen of Figure 1 on a white temperature of the image of Figure 14a;
  • Figure 15a is an image of a scene captured by the first camera of the interactive video system of Figure 4.
  • Figure 15b is a modified composite image of the scene of Figure 15a, the modified composite image being created by the interactive video system of Figure 4;
  • Figures 16a-16h are images of another scene respectively captured by the distinct cameras of the interactive video system of Figure 4.
  • Figures 17a and 17b are first and second modified composite images of the scene of Figures 16a to 16h, the first and second modified composite images being created by the interactive video system of Figure 4;
  • Figure 18 is a schematic representation of a method for removing the effect of a shadow in an overlapping zone of projectors on a screen according to an embodiment.
  • Various aspects of the present disclosure generally address one or more of the problems related to imperfect capture of images of objects or persons located in front of projection screens and to the emergence of shadows caused by the presence of objects or persons in front of projection screens. Embodiments of the present disclosure also allow some objects or disturbing light spots to be removed dynamically from the captured images of objects or persons located in front of projection screens.
  • FIG. 1 is a close-up, partial front elevation view of a conventional perforated front projection screen.
  • a perforated screen 10 has a large number of holes 12 that cover at least about 5%, or more generally about 7% of the entire surface of the perforated screen 10.
  • the present technology uses the presence of these holes to hide a plurality of cameras behind the perforated screen 10.
  • Two or more cameras of a camera assembly are positioned on the rear face of the perforated screen 10, their apertures being substantially aligned with corresponding holes 12 of the perforated screen 10.
  • the cameras acquire images of an object, for example a person, located in front of the perforated screen 10.
  • Some projection screens are not perforated; the camera assembly may alternatively be positioned on the rear face of such screen in which holes are specifically pierced in front of each camera aperture.
  • the number and size of the holes of the perforated screen 10 may vary widely and do not limit usage of the present technology.
  • the number and size of the holes are mainly related to concerns that are not related to the present technology, for example the quality of the image projected on the perforated screen 10 and an expected distance between viewers and the perforated screen 10.
  • a controller receives images from the two or more cameras and builds a three-dimensional (3D) image (or model) of the object or person. The object or person is then identified by the controller.
  • the controller combines information from the received images to form a composite image of a scene captured by the cameras, the scene consisting generally of entities present in front of the perforated screen 10 and within a field of view of the cameras.
  • a model of the object or person may be constructed based on the composite image. This model may be added, possibly following a modification, to an image projected on the front of the perforated screen 10.
  • the model may be used to hide the object or person from the composite image, for example when it is desired to remove or modify the image of a given object or person from the composite image of the scene in front of the perforated screen 10 as captured by the cameras.
  • the model may also be used to remove a shadow of the object or person caused by the presence of the object or person between the perforated screen 10 and two or more projectors.
  • the present technology may provide interactivity between a video system and an object or a person positioned in front of the perforated screen 10.
  • Such an interactive video system may be used, without limitation, in virtual reality applications, in videoconferencing applications, in gaming applications, advertising applications, and other multimedia applications.
  • the present technology captures images of a scene and of an object or person facing the perforated screen 10 while other images are being projected on the perforated screen 10.
  • The present description will use the term “image” to refer to images captured by the camera assembly, including such images as generated or modified by the video system and projected on the perforated screen 10.
  • The term “projection” will be used to refer to images projected by projectors on the perforated screen 10. This choice of terminology is made for the sole purpose of clarity and is not intended to limit the present disclosure.
  • Although reference is made to an object in the singular form, the present technology is adapted to capture a plurality of images, including moving images, of a plurality of objects, including a plurality of persons or a combination of objects and persons positioned in front of the perforated screen 10.
  • FIG 2 is a rear perspective view of a camera assembly comprising a plurality of cameras according to an embodiment.
  • Figure 3 is a front elevation view of the camera assembly of Figure 2.
  • Figure 4 is a partial view of an interactive video system comprising the camera assembly of Figure 2 mounted on a rear face of the perforated screen of Figure 1 according to an embodiment.
  • a camera assembly 20 includes nine camera supports 21-29 mounted on a board 30. Cameras 31-39 having respective apertures 41-49 are mounted on the camera assembly 20 using the camera supports 21-29.
  • the camera assembly 20 is positioned on a rear face of the perforated screen 10. The number and position of the cameras 31-39 within the camera assembly 20 may be configured according to the needs of a particular application.
  • a distance between two cameras may be substantially equal to a typical inter-eye distance of an adult.
  • While Figures 2, 3 and 4 show the cameras 31-39 generally forming a circle with a central camera 35, other arrangements can be contemplated, for example horizontal or vertical linear groups of cameras.
  • a variant of the camera assembly 20 comprising a plurality of horizontally distributed cameras for better locating a person placed at various spots along the width of a large screen is also contemplated.
  • the apertures 41-49 of the cameras 31-39 are substantially aligned with corresponding holes 12 of the perforated screen 10.
  • Diameters of the apertures 41-49 of the cameras 31-39 are selected to be approximately equal to diameters of the holes 12 of the perforated screen 10. This alignment and this size match do not need to be perfect, and the cameras 31-39 may not be perfect either; calibration techniques used to overcome imperfect alignment and size match between the apertures 41-49 and the holes 12, and to correct some imperfections of the cameras 31-39, are described hereinbelow.
  • Each camera 31-39 is connected to a controller (Figure 5) by respective flat cables 51-57 and 59 (one flat cable connecting the camera 38 is not shown) and an intermediate connector 60.
  • Other manners of connecting the cameras 31-39 to the controller, including wired or wireless connections, are also contemplated, as the present technology is not dependent on any specific manner of connecting the cameras 31-39 to the controller.
  • the cameras 31-39 may be video cameras capable of capturing moving images.
  • the cameras 31-39 may include infrared cameras, black and white cameras and color cameras.
  • infrared cameras may be used to acquire images of an object positioned in front of the perforated screen 10 when a general area in front of the perforated screen 10 is relatively dark.
  • Color cameras may be used when it is desired to acquire or generate a color image or model of the object.
  • the camera assembly 20 may include more than one such camera type.
  • FIG. 5 is a schematic block diagram of the interactive video system of Figure 4 according to an embodiment.
  • An interactive video system 70 comprises a controller 72 connected to a number n of cameras of the camera assembly 20, including at least two cameras, for example the cameras 31-39. Two cameras of the camera assembly 20 are sufficient to provide 3D images of an object (including a person) positioned in front of the perforated screen 10 (shown on Figures 1 and 4). A larger number n of the cameras of the camera assembly 20 allows the controller 72 to obtain a better image resolution for the object.
  • the controller 72 includes a processor 74 (or a plurality of cooperating processors) operatively connected to a memory device 76 (or a plurality of cooperating memory devices) and to an input/output device 78 allowing the controller 72 to be connected to the cameras 31-39 through the intermediate connector 60.
  • the controller 72 may also be connected, via the input/output device 78, to one or more projectors (projectors 86 and 88 are shown) and to a network interface 90 allowing the interactive video system 70 to interoperate with various remote systems.
  • the controller 72 may comprise a plurality of cooperating input devices, output devices, and/or input/output devices adapted for communicating with the cameras 31-39, with the projectors 86 and 88, and with the network interface 90, using wired or wireless connections.
  • the memory device 76 contains a non-transient memory 80 storing computer instructions that, when executed by the processor 74, allow the controller 72 to process images acquired from the cameras 31-39.
  • the memory device 76 may also store a table 82 containing factory information about characteristics of the cameras 31-39 and a table 84 containing values obtained following calibration of the cameras 31-39 and concerning parameters of the projectors 86 and 88 and of the screen 10.
  • the controller 72 receives an image from each of the cameras 31-39 of the camera assembly 20, and identifies, based on the images received from each of the cameras 31-39, an object located in front of the perforated screen 10.
  • the controller 72 may search for the object when the object is located in front of a specific section of the perforated screen 10.
  • the controller 72 may segment the images received from each of the cameras 31-39 into blocks, the images being for example segmented into distinct pixels.
  • the controller 72 may then perform a block-by-block (or pixel-by-pixel) comparison of the images received from the cameras 31-39 to identify the object based on differences found in these comparisons.
  • the controller 72 may determine a 3D location and a 3D shape of the object in front of the perforated screen 10.
  • the processor 74 (or the plurality of cooperating processors) has sufficient processing power to receive moving images from the cameras 31-39 and to track movements of the object in real time.
  • the controller 72 may use this information to generate or modify a projection presented on the perforated screen 10.
  • the controller 72 may generate a model of the object. This model may be identical to the captured image of the object.
  • the controller 72 may modify a primary image of the object in one or more of the images received from the cameras 31-39 and apply a modification to the primary image of the object to generate the model of the object.
  • the modification of the primary image used to form the model of the object may use information received at the controller 72 from the network interface 90.
  • the projector 86 or 88 is oriented toward a front surface of the perforated screen 10.
  • the controller 72 may cause the projector 86 or 88 to project a projection on the perforated screen 10, this projection could include the model of the object.
  • the object may be a gamer and the projector 86 or 88 may project a gaming scene on the perforated screen 10.
  • An image of the gamer may be added to the projection so that the gamer will see himself/herself on the projected image.
  • the projected image could show a real-life view of the gamer, or modify the image of the gamer to show the gamer wearing a garment (clothes, armor, helmet, and the like) consistent with the content of the gaming scene.
  • the object may be a customer buying clothes; the customer could be viewed in the projection as if wearing the clothes.
  • Other virtual reality applications could show a person in various environments projected on the perforated screen 10.
  • the projector 86 is configured to project the projection on at least a first portion 92 (Figure 18) of the perforated screen 10 and the projector 88 is configured to project the same projection on at least a second portion 94 of the perforated screen 10, the first and second portions of the perforated screen 10 overlapping over a third portion 96 of the perforated screen 10 so that the first and second portions together entirely cover the front surface of the perforated screen 10.
  • Information about relative 3D positions of the projectors 86, 88 and of the perforated screen 10 is stored in the memory device 76 of the controller 72. Having determined a location and a shape of the object, the controller 72 uses the relative positions of the projectors 86, 88 and of the perforated screen 10 to determine whether a shadow is created by the object on the first portion of the perforated screen 10. If so, the controller 72 causes the projector 88 to increase a luminance of the projection in an area of the shadow on the second portion of the perforated screen. Conversely, if the controller 72 determines that a shadow is created by the object on the second portion of the perforated screen 10, the controller 72 causes the projector 86 to increase the luminance of the projection in an area of the shadow on the first portion of the perforated screen 10.
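  • The following is a minimal, illustrative sketch of this compensation step (see also Figure 18). It assumes that the two projector frames have already been warped into a common screen coordinate system and that the overlap is blended with per-pixel weights; all names and the clipping of small blend weights are assumptions, not elements of the present disclosure.

```python
import numpy as np

def compensate_shadow(frame_b: np.ndarray,
                      shadow_mask: np.ndarray,
                      blend_weight_b: np.ndarray) -> np.ndarray:
    """Raise projector B's contribution where projector A's light is blocked.

    frame_b        : image projector B is about to display (H x W x 3, uint8)
    shadow_mask    : boolean H x W array, True where A's projection is shadowed
    blend_weight_b : H x W array in [0, 1], B's share of the blended overlap
    """
    out = frame_b.astype(np.float32)
    # Where A is shadowed, B must supply the full luminance instead of only its
    # blended share; scale by 1 / weight, guarding against very small weights.
    gain = np.where(shadow_mask, 1.0 / np.clip(blend_weight_b, 0.25, 1.0), 1.0)
    out *= gain[..., None]
    return np.clip(out, 0, 255).astype(np.uint8)
```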
  • the controller 72 may identify one or more light sources illuminating the front face of the perforated screen 10.
  • a light source may include, for example, one or both of the projectors 86 or 88. This identification of the light sources is made by the controller 72 based on differences found in block-by-block or pixel-by-pixel comparisons of the images received from the cameras 31-39.
  • the controller 72 may cancel light from the one or more light sources from the generated model of the object, thereby eliminating the interference caused by the one or more light sources.
  • the controller 72 may generate a model of the object (or person) based on the images received from the cameras 31-39 or based on a modification thereof, modify the composite image by adding the model of the object (or person) to the composite image, and cause the network interface 90 to transmit the modified composite image, which is expected to be received by a videoconferencing terminal of another person.
  • Another projection may be received from that other person via the network interface 90 and the controller 72 may cause one of the projectors 86 or 88 to project the received projection on the perforated screen 10.
  • the controller 72 may therefore generate a model of the object to be hidden, modify the composite image by removing the model of the object to be hidden from the composite image, and cause the network interface 90 to transmit the modified composite image without the hidden object.
  • calibration techniques are used to overcome imperfections in the alignment of the apertures 41-49 of the cameras 31-39 with the holes 12 of the perforated screen 10. Calibration may also be used to overcome differences between characteristics of each camera 31-39 and differences caused, for example, by imperfect alignment between the cameras 31-39 on the board 30 or from any interference occurring due to characteristics of the holes 12 or characteristics of the screen 10.
  • Figure 6 is a sequence diagram showing operations of a camera color calibration method according to an embodiment.
  • a sequence 100 comprises a plurality of operations, some of which may be executed in variable order, some of the operations possibly being executed concurrently, some of the operations being optional.
  • the camera assembly 20 is positioned on a rear face (or back side) of the perforated screen 10 such that the apertures 41-49 of the cameras 31-39 are each positioned substantially behind the center of a corresponding one of the holes 12.
  • a diffuse white light source is placed in front of the perforated screen 10 at operation 102 to project white light through the perforated screen 10 toward each camera 31-39 of the camera assembly 20.
  • White light may be projected simultaneously or successively toward each camera 31-39 of the camera assembly 20.
  • the cameras 31-39 capture images of the white light at operation 103.
  • the controller 72 compares a capture of the white light by the cameras 31-39 of the camera assembly 20 for computing a color correction for each camera 31-39. To obtain this color correction, the controller 72 calculates a luminance and a white balance of the captures by each of the cameras 31-39 and, for each camera, calculates an adjustment of the luminance and of the white balance so that the luminance and the white balance are equalized within the camera assembly 20.
  • the controller 72 stores the adjustments of the luminance and of the white balance for the cameras 31-39 in the table 84 of calibration values, in the memory device 76.
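  • A minimal sketch of this color calibration (sequence 100) is given below. It is not the exact procedure of the present disclosure: the Rec. 709 luma weights, the choice of the assembly-wide average as the equalization target, and all function names are assumptions made for illustration.

```python
import numpy as np

REC709 = np.array([0.2126, 0.7152, 0.0722])  # assumed luma weights

def color_calibration(captures: dict) -> dict:
    """captures maps a camera id to an RGB capture of the diffuse white light (H x W x 3)."""
    stats = {}
    for cam_id, img in captures.items():
        mean_rgb = img.reshape(-1, 3).astype(np.float64).mean(axis=0)
        stats[cam_id] = {"mean_rgb": mean_rgb, "luminance": mean_rgb @ REC709}
    # Equalize every camera toward the assembly-wide averages.
    target_lum = np.mean([s["luminance"] for s in stats.values()])
    target_rgb = np.mean([s["mean_rgb"] for s in stats.values()], axis=0)
    adjustments = {}
    for cam_id, s in stats.items():
        adjustments[cam_id] = {
            "wb_gain": target_rgb / s["mean_rgb"],     # per-channel white balance gain
            "lum_gain": target_lum / s["luminance"],   # overall luminance gain
        }
    return adjustments  # to be stored in the table 84 of calibration values
```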
  • FIG. 7 is a sequence diagram showing operations of a geometric camera calibration method according to an embodiment.
  • a sequence 200 comprises a plurality of operations, some of which may be executed in variable order, some of the operations may be executed concurrently, some of the operations being optional.
  • the sequence 200 also includes operation 101.
  • one of the cameras 31-39 of the camera assembly 20, for example the camera 31, is selected as a reference camera and images of the other cameras 32-39 are adjusted to match images captured by the reference camera 31. While the present example places the reference camera 31 substantially at the center of the camera assembly 20, it is contemplated that any one of the cameras 31-39 may be selected as the reference camera.
  • a first position for a calibration pattern is provided in front of the perforated screen for capture by the cameras 31-39 of the camera assembly 20.
  • the cameras 31-39 capture images of the calibration pattern at this first position.
  • the images are forwarded to the controller 72, in which the processor 74 stores the captures of the first calibration pattern in the table 84 of calibration values, in the memory device 76.
  • Operation 203 may determine that operations 201 and 202 are to be repeated a number X of times to ensure a correct geometric camera calibration precision.
  • the number X is at least equal to two repetitions. In a non-limiting example, a typical number of repetitions is 10, which is appropriate to provide good accuracy of the geometric camera calibration in most cases.
  • the calibration pattern is moved to a different position than the previous one and the sequence 200 returns to operations 201 and 202, now executed with the next calibration pattern position. This process is executed for up to X different sequences of images.
  • operation 203 may determine that no further repetition of operations 201 and 202 is to be executed.
  • the controller 72 compares the stored captures of the reference camera 31 with the stored captures of the other cameras 32-39 of the camera assembly 20 to estimate distortion parameters of the other cameras 32-39 of the camera assembly 20.
  • the processor 74 may use the factory information about characteristics of the cameras 31-39 contained in the table 82 of the memory device 76 to better estimate these distortion parameters.
  • the controller 72 calculates a geometrical adjustment sufficient to overcome the distortion parameters.
  • the controller 72 stores the geometrical adjustments in the table 84 of calibration values, in the memory device 76.
  • In a variant, a virtual camera is defined; in this situation, an actual representation of the calibration pattern is assumed to have been captured by the virtual camera.
  • the stored captures of the cameras 31-39 are compared with the actual representation of the calibration pattern to estimate distortion parameters of each of the cameras 31-39.
  • the controller 72 calculates a geometrical adjustment sufficient to overcome the distortion parameters calculated for each camera 31-39.
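  • As one possible, non-limiting realization of sequence 200, the sketch below uses OpenCV's standard checkerboard calibration in place of the unspecified calibration pattern, and approximates the comparison against the reference camera 31 with a stereo calibration. The pattern size, square size and function names are assumptions.

```python
import cv2
import numpy as np

PATTERN = (9, 6)        # inner corners of an assumed checkerboard pattern
SQUARE_SIZE = 0.025     # assumed square size in metres

def calibrate_against_reference(ref_images, cam_images):
    """Estimate a camera's intrinsics, distortion and pose relative to the reference
    camera, keeping only captures in which both cameras detected the pattern."""
    objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_SIZE
    obj_pts, ref_pts, cam_pts = [], [], []
    for ref_img, cam_img in zip(ref_images, cam_images):
        ok_r, c_r = cv2.findChessboardCorners(cv2.cvtColor(ref_img, cv2.COLOR_BGR2GRAY), PATTERN)
        ok_c, c_c = cv2.findChessboardCorners(cv2.cvtColor(cam_img, cv2.COLOR_BGR2GRAY), PATTERN)
        if ok_r and ok_c:
            obj_pts.append(objp)
            ref_pts.append(c_r)
            cam_pts.append(c_c)
    size = ref_images[0].shape[1::-1]   # (width, height)
    _, k_ref, d_ref, _, _ = cv2.calibrateCamera(obj_pts, ref_pts, size, None, None)
    _, k_cam, d_cam, _, _ = cv2.calibrateCamera(obj_pts, cam_pts, size, None, None)
    # Relative rotation R and translation T of this camera with respect to the reference.
    _, _, _, _, _, R, T, _, _ = cv2.stereoCalibrate(
        obj_pts, ref_pts, cam_pts, k_ref, d_ref, k_cam, d_cam, size,
        flags=cv2.CALIB_FIX_INTRINSIC)
    return k_cam, d_cam, R, T   # stored as this camera's geometrical adjustment
```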
  • the present technology allows the calibration of two cameras of the camera assembly 20. Better calibration becomes possible when the camera assembly 20 contains a larger number of cameras.
  • Figure 8 is a sequence diagram showing operations of a method for modifying a composite image with light source removal and/or object removal according to an embodiment.
  • a sequence 300 comprises a plurality of operations, some of which may be executed in variable order, some of the operations may be executed concurrently, some of the operations being optional.
  • Operation 301 comprises acquiring images by at least two cameras, or all cameras, of the camera assembly 20.
  • the luminance and white balance adjustments calculated for each of the cameras at operation 104 of the sequence 100 are applied by the controller 72 to the respective acquired images at operation 302. Then at operation 303, the controller 72 applies the geometric adjustment calculated for each of the cameras at operation 204 of the sequence 200 to the respective acquired images. Corrected images are obtained at operation 304.
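  • A short sketch of operations 302 to 304, applying the stored color and geometric adjustments to a freshly acquired image, follows. The adjustment structures match the illustrative calibration sketches above rather than any documented format.

```python
import cv2
import numpy as np

def correct_image(raw, wb_gain, lum_gain, camera_matrix, dist_coeffs):
    """Apply color equalization (operation 302) then geometric correction (operation 303)."""
    img = raw.astype(np.float32)
    img *= wb_gain * lum_gain                     # per-channel gain times overall gain
    img = np.clip(img, 0, 255).astype(np.uint8)
    return cv2.undistort(img, camera_matrix, dist_coeffs)   # corrected image (operation 304)
```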
  • Operation 305 or operation 311 may follow operation 304, depending on whether it is desired to obtain a composite image of the scene as captured by the cameras 31-39 or a modified composite image in which a view of the scene is modified.
  • operation 311 may comprise the generation of a composite image based on a combination of the images captured by the cameras 31-39.
  • An output 312 comprises the composite image.
  • operation 305 may include detecting, by the controller 72, a light source and/or the contour of an object on the images acquired using at least two cameras of the camera assembly 20. The detections may then be propagated to images acquired by other cameras of the camera assembly 20 at operation 306, specifically if some of the cameras of the camera assembly 20 have not detected the light source and/or the contour of the object. The controller 72 may then calculate masks, for example pixel masks having the same resolution as that of the corrected images obtained at operation 304, for the images of the light source and/or of the object in the various images at operation 307.
  • the controller 72 generates a modified composite image, based on the corrected images obtained at operation 304, the light source and/or the object being optionally removed to generate a composite image of the scene as captured by the cameras 31-39.
  • image pixels contained in the masks representing the light source and/or the contour of the object may be removed from the composite image by the controller 72 to form the modified composite image.
  • the controller 72 may identify the light source and/or the object based on block-by-block or pixel-by- pixel comparison between the corrected images.
  • the object may be hidden from the images by substituting its pixels on some images with corresponding pixels of other images in which the object appears in a different position due to the relative positions of the various cameras 31 -39.
  • a view of the object may be highlighted in the images. If the object is moving, its relative positions in the images of the various cameras 31-39 will also move.
  • the controller 72 may, in real-time, use the images from the various cameras to either continuously highlight the view of the object, or continuously hide the object.
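  • A minimal sketch of hiding an object by pixel substitution is shown below: pixels inside the object's mask in one corrected view are replaced by pixels from another view in which the object, seen through a different hole of the perforated screen 10, does not cover the same region. Geometric alignment of the two views is assumed to have been performed beforehand; names are illustrative only.

```python
import numpy as np

def hide_object(view_a: np.ndarray, view_b: np.ndarray, mask_a: np.ndarray) -> np.ndarray:
    """view_a, view_b: aligned H x W x 3 images; mask_a: True where the object appears in view_a."""
    result = view_a.copy()
    result[mask_a] = view_b[mask_a]   # substitute occluded pixels with the other camera's view
    return result
```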
  • An output 309 comprises the modified composite image.
  • an output 310 comprises an organized point cloud for the modified composite image. This point cloud has the same resolution as the modified composite image and further includes information about a distance of the object from the perforated screen, calculated as a function of a disparity of at least one given pixel captured by at least two cameras of the camera assembly 20.
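  • The distance term of the organized point cloud can be illustrated with the usual pinhole relation, depth = focal length × baseline / disparity. The sketch below is only indicative; the focal length and the baseline (here set to a typical adult inter-eye distance, consistent with the camera spacing mentioned above) are placeholder values, not parameters of the present disclosure.

```python
import numpy as np

def disparity_to_depth(disparity_px: np.ndarray,
                       focal_px: float = 1400.0,
                       baseline_m: float = 0.065) -> np.ndarray:
    """Per-pixel distance (metres) from a camera pair; NaN where disparity is not positive."""
    depth = np.full(disparity_px.shape, np.nan, dtype=np.float64)
    valid = disparity_px > 0
    depth[valid] = focal_px * baseline_m / disparity_px[valid]
    return depth
```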
  • In a videoconferencing application in which a first person faces the perforated screen 10, a model of this first person produced at operation 310 may be oriented by the controller 72 so that eyes of the first person appear to be oriented toward a central focus of the camera assembly 20.
  • the controller 72 forwards this model via the network interface 90 to the corresponding videoconferencing terminal of a second person, so that the second person sees the model of the first person as if the first person was looking directly into the eyes of the second person.
  • FIG. 9 is a sequence diagram showing operations of a dynamic shadow removal method in a multi-projector video system according to an embodiment.
  • a sequence 400 comprises a plurality of operations, some of which may be executed in variable order, some of the operations may be executed concurrently, some of the operations being optional.
  • the sequence 400 starts by using the corrected images obtained at operation 304 and the organized point cloud for the modified composite image obtained at operation 310 of the sequence 300.
  • the sequence 400 is particularly useful in eliminating a potential shadow caused by the presence of the object (or person) between one of the projectors 86 or 88 and the perforated screen 10.
  • a 3D image of the object is detected and segmented in a sequence 500 (described below) using the intrinsic and extrinsic parameters of the cameras obtained at operation 204.
  • a predicted shadow on the video projector image is calculated in a sequence 600 (described below), using intrinsic, extrinsic, and video warping parameters of the video projectors 86 and 88 obtained at operation 402.
  • the intrinsic and extrinsic parameters of the cameras 31-39, as well as the intrinsic parameters of the projectors 86 and 88, may for example be obtained using a calibration procedure as described in International Patent Application Publication No. WO 2018/094513 A1 to Daniel Labonté et al., published on May 31, 2018, the disclosure of which is incorporated by reference herein in its entirety.
  • the parameters of the cameras 31-39 and of the projectors 86 and 88 may be stored in the table 82 of the memory device 76.
  • the controller 72 segments the image to be displayed on the perforated screen 10 by, for example, producing a number of image segments that correspond to a number of the projectors 86, 88, and of any other projector used in combination with the projectors 86 and 88.
  • Parameters used to perform this segmentation include the intrinsic and extrinsic parameters of the projectors, the model and a position of the perforated screen 10, and blended images obtained at operation 404.
  • the controller 72 causes the projectors 86, 88 and any other projector to project the image segments on the perforated screen 10.
  • The controller 72, having determined that the object is positioned to create a shadow on a projection from one of the projectors 86 or 88 on the surface of the perforated screen 10, calculates a blending of the projections from the two projectors 86 and 88.
  • the controller 72 causes the other of the projectors 86 or 88 to increase a luminance of the projection in an area of the shadow to compensate for the shadow effect.
  • a resulting projection that appears on the perforated screen 10 is substantially devoid of any shadow effect caused by the object.
  • FIG 10 is a sequence diagram showing operations of a method for extracting a cut out model of an object facing the screen from a scene view according to an embodiment.
  • a sequence 500 comprises a plurality of operations, some of which may be executed in variable order, some of the operations may be executed concurrently, some of the operations being optional.
  • the sequence 500 uses the corrected images obtained at operation 304 and the intrinsic and extrinsic camera parameters obtained at operation 204.
  • the controller 72 detects multiple views of an object from the images obtained from the cameras 31-39. A contour of the object is detected at operation 502. Using the organized point cloud for the modified composite image obtained at operation 310, the controller 72 segments the object at operation 503. The controller 72 then produces an object point cloud (model of the object) at operation 504 for use as the detected and segmented 3D image of the object in the sequence 400 ( Figure 9).
  • FIG 11 is a sequence diagram showing operations of a method for removing from a video projection a shadow of an object located in front of the screen according to an embodiment.
  • a sequence 600 comprises a plurality of operations, some of which may be executed in variable order, some of the operations may be executed concurrently, some of the operations being optional.
  • the controller 72 predicts the eventual, pixel-by-pixel, location of a shadow caused by the presence of the object in front of the perforated screen 10. To this end, the object point cloud (model of the object) obtained at operation 504 is retrieved at operation 505.
  • the object point cloud is projected on the perforated screen 10 at operation 601 , using intrinsic, extrinsic, and video warping parameters of the video projectors 86 and 88 obtained at operation 402 to warp the object.
  • a 2D convex hull of the object on the projected image is computed at operation 603.
  • a predicted object shadow on the projected image is calculated at operation 604.
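  • An illustrative sketch of operations 601 to 604 follows: the object point cloud is projected into a projector's image plane using camera-style intrinsic and extrinsic parameters, and the 2D convex hull of the projected points is filled to obtain the predicted shadow mask. Names and parameter shapes are assumptions, not the data structures of the present disclosure.

```python
import cv2
import numpy as np

def predict_shadow_mask(object_points, rvec, tvec, proj_matrix, dist_coeffs,
                        proj_width, proj_height):
    """Return a binary mask of the predicted shadow in projector pixel space."""
    pts_2d, _ = cv2.projectPoints(object_points.astype(np.float32),
                                  rvec, tvec, proj_matrix, dist_coeffs)   # operation 601
    pts_2d = pts_2d.reshape(-1, 2).astype(np.int32)
    hull = cv2.convexHull(pts_2d)                       # operation 603: 2D convex hull
    mask = np.zeros((proj_height, proj_width), np.uint8)
    cv2.fillConvexPoly(mask, hull, 255)                 # operation 604: predicted shadow area
    return mask > 0
```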
  • The sequences described hereinabove, including the sequences 500 and 600, may be configured to be processed by one or more processors, the one or more processors being coupled to a memory device, for example the processor 74 and the memory device 76 of the controller 72.
  • Figure 12 is an example of a white light source image captured by a first camera of the interactive video system of Figure 4, the first camera not being calibrated.
  • An image 700 is captured using the camera 31 prior to any calibration.
  • An area 702 just above the center of the image 700 is somewhat lighter in its central area than on its periphery due to a phenomenon referred to as vignetting that causes a reduction of an image’s brightness on its periphery.
  • the vignetting may be caused in part by diffraction of the light from the white light source within the holes 12 of the perforated screen 10.
  • the image 700 is an inaccurate representation of the white light source (not shown) used to produce the image 700.
  • Figure 13a is an example of a white light source image captured by the first camera of the interactive video system of Figure 4, the first camera being calibrated using instructions from a manufacturer of the first camera.
  • Figure 13b is a representation of an influence of holes in the screen of Figure 1 on a luminance of the image of Figure 13a.
  • Figure 13c is a representation of an influence of holes in the screen of Figure 1 on a white temperature of the image of Figure 13a.
  • Figure 13a shows an image 710 captured using the same camera 31 that was used to capture the image 700 of Figure 12, with the same white light source; a lighter area 712 remains visible in the image 710.
  • a graph 714 shows the influence of holes in the screen of Figure 1 on the luminance of the image 710; the lighter area 712 is more readily visible on the graph 714.
  • a graph 716 shows the white temperature of the image of Figure 13a, the temperature varying significantly, with lower temperatures being found in the lighter area 712.
  • vertical and horizontal axes of the graphs 714 and 716 are in pixels.
  • the scale on the right-hand side of Figure 13b is an unscaled indication of the luminance.
  • the scale on the right-hand side of Figure 13c is a color temperature in Kelvin.
  • Figure 14a is an example of a white light source image captured by the first camera of the interactive video system of Figure 4, the first camera being calibrated using the method of Figures 6 and 7.
  • Figure 14b is a representation of an influence of holes in the screen of Figure 1 on a luminance of the image of Figure 14a.
  • Figure 14c is a representation of an influence of holes in the screen of Figure 1 on a white temperature of the image of Figure 14a.
  • the scales shown on Figures 14b and 14c are the same as in Figures 13b and 13c.
  • Figure 14a shows an image 720 captured using the same camera 31.
  • the image 720 has a uniform, light gray surface, and is free from any vignetting degradation.
  • a graph 722 shows that the luminance of the image 720 is very constant across its entire surface.
  • a graph 724 shows that a white temperature of the image 720 captured by the camera 31 when calibrated using the techniques of the present disclosure is very homogeneous. Similar results have been obtained from all cameras of the camera assembly 20 when calibrated in the same manner, the results being consistent between all cameras.
  • Figure 15a is an image of a scene captured by the first camera of the interactive video system of Figure 4.
  • the camera 31 has captured an image 800 showing a background 802 and a foreground 804.
  • a projector 806 is visible in the foreground 804.
  • the projector 806 generates a bright light spot 808 that interferes with the view of the background 802.
  • Figures 15b is a modified composite image of the scene of Figure 15a, the modified composite image being created by the interactive video system of Figure 4.
  • a modified composite image 810 shows the background 802 but hides the projector 806 present in the foreground 804. More particularly, interference from the light spot 808 generated by the projector 806 is completely removed from the modified composite image 810.
  • the controller 72 of the interactive video system 70 has identified the projector 806 and the light spot 808 in the various images captured by the cameras of the camera assembly 20 and has eliminated, in each of the captured images, the pixels representing this projector 806 and the light spot 808. For example, if a given pixel from the camera 34 is very different from the same pixel from the cameras 31-33 and 35-39, that given pixel from the camera 34 may be eliminated from the modified composite image 810 by the controller 72. The modified composite image 810 was then formed of these modified captured images.
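  • The per-pixel comparison described above can be sketched as a simple outlier rejection across the aligned camera views, as below; the threshold value and the use of the per-pixel median as the consensus value are illustrative assumptions.

```python
import numpy as np

def remove_light_spots(aligned_views: np.ndarray, threshold: float = 60.0):
    """aligned_views: N x H x W x 3 stack of geometrically aligned camera images.
    Returns (composite, outlier_mask): outliers (e.g. a bright spot seen through only
    some holes) are replaced by the per-pixel median of all views."""
    views = aligned_views.astype(np.float32)
    median = np.median(views, axis=0)                     # consensus value per pixel
    deviation = np.linalg.norm(views - median, axis=-1)   # N x H x W colour distance
    outlier_mask = deviation > threshold                  # True where a view disagrees strongly
    composite = np.clip(median, 0, 255).astype(np.uint8)
    return composite, outlier_mask
```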
  • Figures 16a-16h are images of another scene respectively captured by the distinct cameras of the interactive video system of Figure 4.
  • Figures 16a to 16h are images captured by the cameras 32-39, respectively (no image from the reference camera 31 is shown).
  • a scene shows a toy lama 820 supported by a tower 822, and a tripod 826; a computer monitor 824 appears in the background. Owing to the different positions of the cameras 32-39 in the camera assembly 20, relative positions of the lama 820, of the tripod 826 and of the computer monitor 824 appear to change from one image to the other.
  • Figures 17a and 17b are first and second modified composite images of the scene of Figures 16a to 16h, the first and second modified composite images being created by the interactive video system of Figure 4.
  • Figure 17a presents an identification of an object present in the scene of Figures 16a-16h, in the present case the toy lama 820.
  • Figure 17b presents the scene of Figures 16a-16h in which this object is removed.
  • Figure 17a was obtained by causing the controller 72 of the interactive video system 70 to select the lama 820 and the tower 822 as the object of interest. The other elements of the images were eliminated by the controller 72 when forming a first modified composite image 830.
  • Figure 17b was obtained by causing the controller 72 of the interactive video system 70 to select the tripod 826 as the object of interest. The other elements of the images were eliminated by the controller 72 when forming a second modified composite image 840.
  • Figure 18 is a schematic representation of a method for removing the effect of a shadow in an overlapping zone of projectors on a screen according to an embodiment.
  • Figure 18 shows the perforated screen 10, the projectors 86 and 88, and a person 91 positioned in front of the perforated screen 10.
  • the projector 86 projects a projection in a first portion 92 of the perforated screen 10.
  • the projector 88 projects the same projection on a second portion 94 of the perforated screen 10.
  • These first and second portions 92 and 94 overlap in a broad, overlapping portion 96 of the perforated screen 10.
  • the overlapping portion 96 may fully cover the first and second portions 92 and 94 of the perforated screen 10.
  • the presence of the person 91 between the projector 86 and the first portion 92 causes the formation of a shadow 93 in the first portion 92.
  • the controller 72 determines a location and a shape of the person 91, determines that a shadow is created by the person on the first portion 92 of the perforated screen 10, and causes the projector 88 to increase a luminance of the projection in the area of the shadow 93 on the second portion 94 of the perforated screen 10 to counter the effect of the shadow 93 that appears on the first portion 92 of the perforated screen 10.
  • the components, process operations, and/or data structures described herein may be implemented using various types of operating systems, computing platforms, network devices, computer programs, and/or general-purpose machines.
  • devices of a less general-purpose nature such as hardwired devices, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), or the like, may also be used.
  • Where a method comprising a series of operations is implemented by a computer, a processor operatively connected to a memory, or a machine, those operations may be stored as a series of instructions readable by the machine, processor or computer, and may be stored on a non-transitory, tangible medium.
  • Systems and modules described herein may comprise software, firmware, hardware, or any combination(s) of software, firmware, or hardware suitable for the purposes described herein.
  • Software and other modules may be executed by a processor and reside on a memory of servers, workstations, personal computers, computerized tablets, personal digital assistants (PDA), and other devices suitable for the purposes described herein.
  • Software and other modules may be accessible via local memory, via a network, via a browser or other application or via other means suitable for the purposes described herein.
  • Data structures described herein may comprise computer files, variables, programming arrays, programming structures, or any electronic information storage schemes or methods, or any combinations thereof, suitable for the purposes described herein.

Abstract

The invention concerns a video system comprising a perforated screen, a camera assembly and a controller operatively connected to the camera assembly. The camera assembly comprises a plurality of cameras and is positioned on a rear face of the perforated screen. Apertures of each of the plurality of cameras are substantially aligned with a corresponding hole in the perforated screen. The controller receives an image from each of the plurality of cameras of the camera assembly and identifies, based on the images received from each of the cameras, an object located in front of the perforated screen. The controller may generate a composite image of a scene captured by the cameras. The controller may identify a light source illuminating the perforated screen. The controller may cancel the light source or the object from the composite image. A calibration pattern may be used to geometrically adjust each camera of the camera assembly.
PCT/CA2020/051661 2019-12-04 2020-12-03 Système vidéo, procédé d'étalonnage du système vidéo et procédé de capture d'une image à l'aide du système vidéo WO2021108913A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962943445P 2019-12-04 2019-12-04
US62/943,445 2019-12-04

Publications (1)

Publication Number Publication Date
WO2021108913A1 true WO2021108913A1 (fr) 2021-06-10

Family

ID=76220857

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2020/051661 WO2021108913A1 (fr) 2019-12-04 2020-12-03 Système vidéo, procédé d'étalonnage du système vidéo et procédé de capture d'une image à l'aide du système vidéo

Country Status (1)

Country Link
WO (1) WO2021108913A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1977593A1 (fr) * 2006-01-12 2008-10-08 LG Electronics Inc. Traitement vidéo multivue
WO2012177643A2 (fr) * 2011-06-22 2012-12-27 Microsoft Corporation Étalonnage de modèle articulé dynamique totalement automatique
US8675043B2 (en) * 2006-04-25 2014-03-18 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. Image recording system providing a panoramic view
CN203966475U (zh) * 2014-04-30 2014-11-26 深圳市联建光电股份有限公司 具有多个摄像头的led显示系统
EP3265999A2 (fr) * 2015-03-01 2018-01-10 NEXTVR Inc. Procédés et appareil de réalisation de mesures environnementales et/ou d'utilisation de ces mesures dans un rendu d'image 3d
WO2018094513A1 (fr) * 2016-11-23 2018-05-31 Réalisations Inc. Montréal Système et procédé de projection à étalonnage automatique



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20896057

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20896057

Country of ref document: EP

Kind code of ref document: A1