WO1997039581A1 - Method and apparatus for presenting in a computer display images representing remote moving objects - Google Patents

Method and apparatus for presenting in a computer display images representing remote moving objects

Info

Publication number
WO1997039581A1
WO1997039581A1 (PCT/US1997/006352)
Authority
WO
WIPO (PCT)
Prior art keywords
image, image signal, signal, total, visual
Application number
PCT/US1997/006352
Other languages
French (fr)
Inventor
Jack J. Campbell
Original Assignee
Campbell Jack J
Application filed by Campbell Jack J filed Critical Campbell Jack J
Publication of WO1997039581A1 publication Critical patent/WO1997039581A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems

Abstract

In a communication system, an image of a physically-remote moving object conveyed by signals such as video or television signals is presented in a computer display such that the image of the object is presented without its background obscuring other visual elements presented in the display. At a physically-remote location, an object-image signal representing only the image of a moving object is generated in response to a total-image signal representing both object and background and a control signal indicating what portion of the total-image signal corresponds to the image of only the object. The object-image signal includes encoded image information representing the image of the moving object and includes decoding information needed to decode the image information and reconstruct the image of the object. The object-image signal is transmitted from the physically-remote location to a computer display in which a visual-image signal is obtained by decoding the encoded image information and then presenting the visual-image signal together with other visual elements in such a way that the moving object without background appears in front of the other visual elements.

Description

DESCRIPTION
METHOD AND APPARATUS FOR PRESENTING IN A COMPUTER DISPLAY IMAGES REPRESENTING REMOTE MOVING OBJECTS
TECHNICAL FIELD
The invention relates to communication systems in which moving visual images, such as those represented by video or television signals generated at a physically remote location, are presented on a display, for example a computer display, such that only the object itself appears to be in front of or on top of other visual elements shown on the display.
BACKGROUND ART
There is a growing interest in the use of computers to facilitate communication between physically remote locations. In particular, there is a growing interest in using computer displays to show visual-image signals, such as video or television signals, generated at physically remote locations. Such usage finds application in viewing and/or producing visual programs, in teleconferencing and broadcasting. Some examples of products which may be used in computer systems to present video signals on a computer display or within a portion or "window" of a computer display include VDOPhone™ by VDOnet Corporation of Palo Alto, California, ATI-TV by ATI Technologies Inc. of Toronto, Canada, and various models of Video Vision by Radius Inc. of Sunnyvale, California. These systems present the entire visual-image signal in an area of a display, typically a rectangular area.
Unfortunately, these systems impose some penalties. One penalty is that the presentation of an entire visual-image in an area of a display obscures everything else that would otherwise appear in the display within that area. A second penalty is that a greater bandwidth is required to carry the entire visual-image signal than would be needed to carry just the image of the portion of interest. In a teleconferencing system, for example, a viewer is generally not interested in the background behind another person participating in the conference. If the image of only the other person could be presented on a display, other visual elements that are in proximity to the person's image as presented on the display would still be visible. Furthermore, the bandwidth or storage space required to convey or store the person's image would be less than that required to convey or store the entire visual-image signal.
One method for displaying a person's face on a computer display without hiding other visual images is disclosed in Ishii and Arita, "Clearface: Translucent Multiuser Interface for TeamWorkStation," Proc. of Second European Conf. on Computer-Supported Cooperative Work, Sept. 1991, pp. 163-174. According to this method, the facial image of a collaborator or co-worker overlays a drawing surface displayed on a computer display in such a way that drawings can be seen through the face image. According to the authors, this method requires that the background of the facial image be clean; otherwise it is difficult for viewers to distinguish drawings from background clutter.
The inventor has determined that displays of a person's image can be improved if the background can be removed from the displayed image. By doing so, confusion caused by background clutter can be avoided and, because the overall size of the image can be limited to just the person, the displayed image need not be translucent.
Techniques are known which permit foreground objects to be separated from the background and shown in front of another scene. Techniques such as so-called chroma-keying and luminance keying, for example, permit foreground objects in one scene to be separated from the original background and combined with another scene. For example, by using such techniques, a third video signal can be generated from a first video signal of a person in a studio and a second video signal of another location such as a mountain range. The resulting third video signal, when presented on a suitable display, creates the impression that the person is actually in front of the mountain range. A form of chroma-key switching is disclosed in Nakamura, SMPTE Journal, vol. 90, February 1981, p. 107, and forms of chroma-key linear compositing are disclosed in U.S. patents 4,589,013 and 4,625,231. These references are incorporated herein by reference in their entirety. A significant limitation to these techniques, however, is that the foreground objects must be placed in front of a uniform background that is essentially monochromatic. Such a background is not commonplace; therefore, these techniques cannot be used in teleconferencing for normal office environments. U.S. patent 4,800,432, incorporated herein by reference in its entirety, describes a technique referred to therein as video difference keying. Video difference keying avoids the need for uniform backgrounds by generating a key from a pixel-by-pixel difference between an input video signal and a stored reference image. The patent describes this technique as one that can generate a key signal which can be used in the compositing of video images; however, it does not disclose or suggest how video difference keying can be used to improve computer displays of video images.
DISCLOSURE OF INVENTION
It is an object of the present invention to provide a method and a system for presenting images on a computer display which overcome the problems discussed above.
In accordance with the teachings of the present invention, one method of presenting an image includes receiving at a physically-remote location a live total-image signal representing an image of a moving object such as a person in front of a background, receiving at the physically-remote location a control signal indicating what portion of the total-image signal corresponds to the image of only the object, generating at the physically-remote location an object-image signal in response to the total-image signal and the control signal such that the object-image signal includes image information representing the image of only the object and includes decoding information needed to decode the image information and reconstruct the image of the object on a display, transmitting the object-image signal from the physically-remote location, receiving the object-image signal and decoding the image information according to the decoding information so as to generate a visual-image signal representing the image of the moving object, and presenting in a computer display one or more "visual elements" and, in response to the visual-image signal, a representation of the moving object such that the image of the moving object without background appears in front of the visual elements.
It should be understood that the term "computer display" and the like refers to devices incorporating a visual display controlled by a digital processor, and the term "transmitting" refers to any form of conveying signals such as electrical, electro-magnetic or optical signals, either analog or digital, between physically remote locations.
The present invention may be incorporated into a variety of methods and systems. Throughout this discussion, more particular mention is made of teleconferencing, which usually entails two-way or bidirectional transmission and display as discussed above; however, it should be understood that the present invention may be used in a broad range of applications which present a visual image in a computer display. The various features of the present invention and its preferred embodiments may be better understood by referring to the following discussion and the accompanying drawings in which like features are referred to by like numbers. The contents of the following discussion and the drawings are set forth as examples only and should not be understood to represent limitations upon the scope of the present invention.
BRIEF DESCRIPTION OF DRAWINGS
Fig. 1 is a schematic block diagram of one embodiment of a system according to the present invention.
Fig. 2a is a schematic block diagram of a component for generating a control signal using image difference keying.
Fig. 2b is a schematic block diagram of a component for adapting a reference image for use with image difference keying.
Fig. 3a is a schematic representation of a computer display presenting several visual elements and an image of an object with background in a conventional window.
Fig. 3b is a schematic representation of a computer display presenting several visual elements and an image of an object without background.
Fig. 4 is a schematic block diagram of a typical computer system.
MODES FOR CARRYING OUT THE INVENTION
Overview
Referring to Fig. 1 of the drawings, a live total-image signal representing an image of a moving object such as a person in front of a background is received from path 10. Control 200 receives the total-image signal and, in response, generates a control signal along path 20 that indicates what portion of the total-image signal corresponds to the image of only the object. Generator 300 receives both the total-image signal and the control signal and, in response, generates an object-image signal along path 30 that includes image information representing the image of only the object and decoding information needed to decode the image information and reconstruct the image on a display. The broken line between path 30 and path 31 represents a communications path along which the signal is conveyed between physically-remote locations. Display 400 receives the object-image signal, including the image information and the decoding information, from path 31 and generates a visual-image signal by decoding the image information according to the decoding information. Display 400 also receives signals from path 40 representing one or more visual elements for presentation. In response to these signals, display system 400 presents the one or more visual elements and a representation of the moving object such that the representation appears as the image of the object without background in front of the visual elements. The representation of the object also may be presented in such a way that it appears to be behind some visual elements. Various components, such as power supplies, filters, oscillators and delay elements, that are necessary in practical embodiments are not illustrated in this and other figures in order to more clearly show features of the present invention.
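A minimal sketch of this flow, assuming 8-bit RGB frames held in NumPy arrays; the function names control_200, generator_300 and display_400 are hypothetical stand-ins for the corresponding blocks of Fig. 1, and the simple difference-and-threshold key used here is only one of the keying options discussed later:

```python
import numpy as np

def control_200(total_image, reference_image, threshold=30):
    # Control signal (path 20): 1 where the frame differs from a stored
    # reference, i.e. where the moving object is; 0 over the static background.
    diff = np.abs(total_image.astype(int) - reference_image.astype(int))
    return (diff.max(axis=-1) > threshold).astype(np.uint8)

def generator_300(total_image, key):
    # Object-image signal (path 30): the keyed pixels plus the key itself,
    # which serves here as the decoding information.
    return {"pixels": total_image * key[..., None], "key": key}

def display_400(object_image_signal, visual_elements):
    # Present the object in front of the other visual elements; where the key
    # is 0 the background is simply absent and the elements remain visible.
    key = object_image_signal["key"][..., None]
    return np.where(key == 1, object_image_signal["pixels"], visual_elements)

# Stand-in frames (height x width x RGB).
h, w = 120, 160
reference = np.random.randint(0, 256, (h, w, 3), dtype=np.uint8)  # background only
frame = reference.copy()
frame[40:90, 60:110] = 200                    # a bright "object" enters the scene
gui = np.full((h, w, 3), 64, dtype=np.uint8)  # stand-in for menu 502, windows 510/520

composited = display_400(generator_300(frame, control_200(frame, reference)), gui)
```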
Display system 400 may be implemented in a wide variety of ways; however, it is contemplated that preferred implementations will incorporate some form of microprocessor-controlled display such as that found in conventional desktop computers or microprocessor-based terminals.
Total-Image Signal
The live total-image signal received from path 10 represents a moving object and its background. In many implementations, this signal is a conventional video signal such as that which can be obtained from a video camera or a television transmission. Typically, the location and/or appearance of the object is changing such that the image of the object represented in the total-image signal is also changing.
Control Signal
The control signal received from path 20 indicates what portion of the total-image signal corresponds to the image of only the object. In practical implementations it is usually difficult if not impossible to identify the portion of the signal that corresponds exactly to the object image. Some error is encountered at the edges. References herein to the image of "only the object" and the like should be understood to mean the image of the object with allowance for some error at the edges of the image. No particular method for obtaining the control signal is critical to the practice of the present invention. For example, techniques such as those used in the various forms of chroma-keying, luminance keying and video difference keying may be used. The choice of technique will be influenced by the nature of the object and particularly by the nature of the background behind the object. If the background is monochromatic and illuminated more or less uniformly, a form of chroma-key switching may be used. These techniques are attractive in that they can be implemented inexpensively; however, the quality of the resulting images is degraded by obtrusive edge effects.
Techniques used in chroma-key linear compositing, such as those mentioned above, may also be used. These techniques avoid many of the edge effects caused by various forms of chroma-key switching and they are also able to manage greater variations in color and lighting of the foreground. Techniques such as video difference keying do not require a background that is essentially monochromatic and uniformly illuminated. Backgrounds with such characteristics are not commonplace; therefore, difference keying is preferred in many applications.
According to US 4,800,432, mentioned above, a reference image for difference keying may be stored from an input video signal or it may be synthesized under operator control. A reference image may also be constructed over time by comparing frames of the total-image signal across some interval of time, identifying portions of the image that do not change, and using those portions to construct the reference image. For example, portions of a background behind a person could be quickly constructed from areas not obscured by the person, and other portions of the background could be constructed as the person moved to reveal those portions.
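One way such a reference could be accumulated is sketched below, assuming a sequence of NumPy frames; the parameters (thirty unchanged frames, a tolerance of eight levels) are illustrative assumptions rather than values taken from the disclosure:

```python
import numpy as np

def build_reference(frames, stable_frames=30, tolerance=8):
    """Accumulate a background reference from portions of the total image
    that do not change over time (hypothetical parameters for illustration)."""
    h, w, c = frames[0].shape
    reference = np.zeros((h, w, c), dtype=np.uint8)
    known = np.zeros((h, w), dtype=bool)        # which pixels are filled in
    stable_count = np.zeros((h, w), dtype=int)  # consecutive unchanged frames
    previous = frames[0].astype(int)

    for frame in frames[1:]:
        current = frame.astype(int)
        unchanged = np.abs(current - previous).max(axis=-1) <= tolerance
        stable_count = np.where(unchanged, stable_count + 1, 0)
        # Pixels that stayed still long enough are taken to be background.
        newly_stable = (stable_count >= stable_frames) & ~known
        reference[newly_stable] = frame[newly_stable]
        known |= newly_stable
        previous = current
    return reference, known
```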
Fig. 2a illustrates one example of a component that can be used to generate a control signal using image difference keying. Init 202 initializes reference image store 204 through switch 203 along path 14 from either the total-image signal received from path 10 or from some other signal received from path 12. If reference image store 204 is implemented by a delay line or other storage technique requiring periodic refresh, the stored image is recirculated from store output path 16 along path 18 to store input path 14; otherwise, path 18 is not required. Comparator 206 generates along path 20 a difference signal obtained from the difference between the total-image signal and a reference image signal received from path 16.
The results achieved by the embodiment illustrated in Fig. 2a are not satisfactory if the background portion of the total-image signal changes. Slowly occurring changes such as changes in outside light level, moving shadows, or movement of the hour and minute hands of a clock can be accommodated by adapting the reference image to follow these changes in the background. Fig. 2b illustrates a variation of the component shown in Fig. 2a that can adapt to slowly occurring changes. According to this variation, amplifier 212 applies a gain g to the difference signal received through switch 210 from path 20, where 0 < g < 1, and passes the amplified signal to summing circuit 208. Summing circuit 208 generates a sum signal along path 18 by combining the reference image signal received from path 16 with the amplified signal received from amplifier 212.
Switch 210 is controlled by a signal generated along path 28 by motion-detection circuit 220. Difference circuit 224 generates for each pixel a signal representing the absolute value of the difference between the total-image signal received from path 10 and a delayed version of the total-image signal received through frame store 222. Switch 228 is controlled by a signal received from threshold 226. When the absolute value of the difference from circuit 224 exceeds a level defined by threshold 226, switch 228 is caused to connect to path 22 which conveys a signal representing the value one; otherwise, switch 228 is caused to connect to path 24 which allows the output of frame store 232 to be fed back to the input of frame store 232 through amplifier 230. The gain a of amplifier 230, the value of which is constrained to 0 < a < 1, causes the signal in feedback path 24 to decay while switch 228 is connected to path 24. Threshold 234 generates a signal along path 28 in response to the signal received from path 26. When the signal received from path 26 exceeds the level defined by threshold 234, the signal generated along path 28 indicates motion is present for a respective pixel; otherwise, the signal generated along path 28 indicates there is no motion. When, for a respective pixel, a sufficient difference between two adjacent frames is detected by difference circuit 224 and threshold 226, switch 228 connects to path 22 causing a value of one to be fed to frame store 232 and to be passed along path 26 to threshold 234. In response thereto, threshold 234 generates a signal along path 28 indicating motion is present for that respective pixel. When, for that respective pixel, the difference between adjacent frames falls below the level of threshold 226, switch 228 connects to path 24 and allows the signal stored in frame store 232, which corresponds to that respective pixel, to decay from one to zero. In preferred embodiments, the gain a of amplifier 230 is set so that the signal level drops below the level defined by threshold 234 after about 10 to 20 seconds. If a sufficient difference between adjacent frames for that respective pixel is detected before the signal on path 26 falls below the level defined by threshold 234, the signal is immediately set to a value of one, as explained above. As a result, the signal generated along path 28 will indicate continuous movement for a respective pixel if differences for that respective pixel exceed the level of threshold 226 at least once during the decay period established by amplifier 230, for example, once every 10 to 20 seconds. Whenever the signal along path 28 indicates that motion is present for a respective pixel, switch 210 is caused to open, thereby causing the amplified signal from amplifier 212 to go to zero. The sum signal along path 18 for the respective pixel in reference image store 204 represents the current value of that pixel; that is, the respective pixel in reference image store 204 is not altered. When no motion for the respective pixel is detected over the decay period mentioned above, motion-detection circuit 220 causes switch 210 to close, thereby causing the respective pixel in reference image store 204 to be adapted by the amplified signal. In the example shown in Fig. 2b, each respective pixel is adapted according to
p(i) = p(i-1) + g · [x(i) - p(i-1)]
where p(i) = adapted pixel p in the reference image,
p(i-1) = previous pixel p in the reference image,
g = gain of amplifier 212, and
x(i) = corresponding pixel in the total-image signal received from path 10.
The index i represents time in general but more particularly, in preferred embodiments, represents a particular video frame; thus, p(i) represents a respective pixel in one video frame and p(i-1) represents that respective pixel in the immediately previous video frame.
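The adaptation and the motion gate of Fig. 2b can be sketched together as one per-frame update, again assuming NumPy frames; the gains and thresholds below are assumptions, chosen so that at roughly 30 frames per second the motion memory decays past its threshold after about 15 seconds, in the spirit of the 10 to 20 second figure above:

```python
import numpy as np

def adapt_reference(reference, frame, previous_frame, motion_memory,
                    g=0.02, decay=0.995, motion_threshold=12, memory_threshold=0.1):
    """One update step in the manner of Fig. 2b (illustrative values only).

    motion_memory plays the role of frame store 232: it is set to 1.0 where a
    frame-to-frame difference exceeds the threshold and otherwise decays toward
    zero, so a pixel is treated as static only after a quiet period."""
    frame = frame.astype(float)
    previous_frame = previous_frame.astype(float)
    reference = reference.astype(float)

    # Motion detection (difference circuit 224, threshold 226, decay a of 230).
    frame_diff = np.abs(frame - previous_frame).max(axis=-1)
    motion_memory = np.where(frame_diff > motion_threshold, 1.0,
                             motion_memory * decay)
    in_motion = motion_memory > memory_threshold        # threshold 234, path 28

    # Adaptation: p(i) = p(i-1) + g * (x(i) - p(i-1)) where no motion is seen;
    # for moving pixels (switch 210 open) the reference is left unchanged.
    adapted = reference + g * (frame - reference)
    reference = np.where(in_motion[..., None], reference, adapted)
    return reference, motion_memory
```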
Object-Image Signal
A variety of techniques may be used to generate image information and decoding information portions of the object-image signal. The nature of the control signal will often influence and may even dictate the technique for generating the image information portion of the object-image signal.
If the control signal is binary, such as that obtained by chroma-key switching, an image-only signal may be generated by extracting intervals of the total-image signal corresponding to intervals of the control signal which are off; i.e., not background.
If the control signal is linear, such as that obtained in systems used for chroma-key linear compositing or video difference keying, the image-only signal may be generated by multiplying the total-image signal by the control signal, where the control signal is substantially zero for portions of the total-image signal corresponding to the background and is substantially one for portions corresponding to the object. Alternatively, the object-only signal may be generated by extracting portions of the total-image signal for portions where the control signal exceeds a threshold.
The image information portion of the object-image signal may be obtained by encoding the object-only signal according to some scheme such as a form of run-length-limited (RLL) coding in which a code represents the interval between one or more references and the edges of the object image. Preferably, the encoding process is accompanied by an adaptive clipping of the total-image signal so as to stabilize the position and size of the moving object.
The encoding process can include some form of data or bandwidth compression such as, for example, processes conforming to various MPEG and JPEG standards, or processes which exploit redundancy between video frames. Lossy and/or lossless compression may be used.
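The sketch below shows one way the image information and decoding information could be packaged, assuming a linear control signal in the range 0 to 1 and using a simple per-row run-length code as a stand-in for the RLL-style scheme; further compression of the pixel block (for example JPEG per frame) could be applied to the "pixels" field but is omitted here:

```python
import numpy as np

def make_object_image_signal(total_image, control, key_threshold=0.5):
    """Illustrative encoder; the names and the per-row run-length code are
    assumptions, not the patent's exact scheme."""
    key = control > key_threshold                        # binary object mask
    if not key.any():
        return None

    # Adaptive clipping: keep only the bounding box of the object so that its
    # position and size are stabilized within the transmitted block.
    rows = np.flatnonzero(key.any(axis=1))
    cols = np.flatnonzero(key.any(axis=0))
    top, bottom = rows[0], rows[-1] + 1
    left, right = cols[0], cols[-1] + 1
    clipped_key = key[top:bottom, left:right]
    clipped_pixels = total_image[top:bottom, left:right] * clipped_key[..., None]

    # Decoding information: per-row run boundaries marking the edges of the
    # object image (a stand-in for the RLL-style code described above).
    runs = []
    for row in clipped_key.astype(np.int8):
        change = np.flatnonzero(np.diff(row, prepend=0, append=0))
        runs.append(change.tolist())                     # start/stop columns

    return {"origin": (int(top), int(left)),
            "pixels": clipped_pixels,                    # image information
            "runs": runs}                                # decoding information
```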
Visual-Image Signal
A visual-image signal representing only the image of the moving object may be obtained by receiving the object-image signal from path 31, extracting the image information and the decoding information from the object-image signal, and decoding the image information according to the decoding information. For example, if the image information was processed by some form of data or bandwidth compression, the visual-image signal is derived using a complementary expansion. The decoding process may also include an adaptive clipping of the derived visual-image signal so as to stabilize the position and size of the moving object.
Adaptive clipping may be performed with the decoding process in addition to or instead of any adaptive clipping performed with the encoding process.
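A matching decoder for the illustrative format sketched above (again an assumption, not the patent's encoding) would rebuild the key from the run boundaries, place the object pixels at their recorded position, and yield a visual-image signal together with a transparency mask for the display stage:

```python
import numpy as np

def decode_object_image_signal(signal, display_height, display_width):
    """Illustrative decoder matching the encoder sketch above (assumed format)."""
    top, left = signal["origin"]
    pixels = signal["pixels"]
    h, w = pixels.shape[:2]

    # Rebuild the key from the per-row run boundaries (the decoding information).
    key = np.zeros((h, w), dtype=bool)
    for y, runs in enumerate(signal["runs"]):
        for start, stop in zip(runs[0::2], runs[1::2]):
            key[y, start:stop] = True

    # Visual-image signal: the object placed at its position in the display,
    # with a transparency mask instead of any background.
    visual_image = np.zeros((display_height, display_width, 3), dtype=pixels.dtype)
    mask = np.zeros((display_height, display_width), dtype=bool)
    visual_image[top:top + h, left:left + w] = pixels
    mask[top:top + h, left:left + w] = key
    return visual_image, mask
```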
Display Presentation
Figs. 3a and 3b each illustrate a viewing area 500 on a computer display that presents a horizontal menu 502 across the top as part of a metaphoric representation of a desktop. In each figure a portion of a memo is presented in a rectangular area with a frame known as a "window." This first window 510 overlaps a second window 520 containing a portion of an electronic spreadsheet and a bar graph. In addition to these visual elements, an image 532 of a person is shown in the lower right-hand corner.
Although the present invention may be incorporated into systems which do not include a computer, preferred implementations do include a computer-controlled display that presents one or more visual elements as well as the image of an object. In this context, a "visual element" either presents results of operation by the computer or it provides facilities by which operation of the computer can be affected. Menu 502, the desktop, and windows 510 and 520 with scroll bars, menus and display areas are examples of visual elements.
Fig. 4 illustrates a schematic block diagram of a computer-based display system. Display processor 402 generates a presentation on display 404 in response to the visual-image signal received from path 31 and signals received from path 40 representing one or more visual elements. In effect, display processor 402 synchronizes and merges the signals received from paths 31 and 40, which can be accomplished using techniques well known in the art. Typically, the signals representing visual elements are generated by software executed by a processor such as processor 408. Operation of processor 408 may be controlled by operator requests received through input 406, which represents devices such as a keyboard and/or a pointing device such as a mouse or trackball. No particular computer, operating system or software environment is critical to the practice of the present invention. Examples of computers that can present the visual-image signal and one or more other visual elements include IBM® PC-compatible computers, Macintosh® computers and Sun® workstations that each comprise a system unit and a display for presenting visual images under control of the system unit. Examples of operating systems and software environments include those offered under the trademarks "Windows," "OS/2," "OS/2 Presentation Manager," "OSF/Motif," "DESQview," "GEOS," and "Macintosh Finder." These products provide various forms of a user interface that employs display ports called windows. It should be appreciated that the present invention is not limited to any particular user interface. For ease of discussion, however, the following discussion is more particularly directed to a so-called graphical user interface (GUI) similar to that shown in the figures.
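In this arrangement the merging performed by display processor 402 amounts to a masked overwrite; a minimal sketch, assuming the visual elements have already been rendered into an RGB buffer and that visual_image and mask come from a decoder such as the one sketched earlier:

```python
import numpy as np

def composite_display(visual_elements, visual_image, mask):
    """Merge in the manner of display processor 402: the object appears in
    front of the visual elements, and its absent background leaves them
    fully visible."""
    return np.where(mask[..., None], visual_image, visual_elements)

# Hypothetical usage: 'gui_frame' stands for menu 502 and windows 510/520 as
# rendered by the window system; 'visual_image' and 'mask' come from the
# decoder sketch above.
# composited = composite_display(gui_frame, visual_image, mask)
```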
Fig. 3a illustrates a screen display of a conventional system in which image 532 of a person and a background are presented in a third window 530. Although image 532 itself does not overlap windows 510 and 520, the frame of window 530 and the portion presenting the background do overlap and obscure some of the information within windows 510 and 520. In the example shown, the overlap could be avoided by merely reducing the size of window 530, assuming that the system implementation permits the size of window 530 to be changed. In this example, however, overlap would be avoided only after reducing the size of window 530 to a point where image 532 would be very small and difficult to see clearly. In a system incorporating features of the present invention, the image of a moving object may be presented on a computer display without its background because the visual-image signal received from path 31 represents the image of only the object. Referring to Fig. 3b, only image 532 is presented. As a result, image 532 in this display, which has the same relative size and location as image 532 illustrated in Fig. 3a, does not overlap and obscure information within either window 510 or window 520. Screen display 500 shown in Fig. 3b presents image 532 with a large size so that it can be seen clearly yet not obscure other information.
A clock appears behind image 532 in the screen display shown in Fig. 3a, but the image of this clock does not appear in the screen display shown in Fig. 3b. This may be accomplished by excluding the clock image from the object-image signal. In embodiments using difference keying to generate the control signal, for example, the clock image may be excluded by adapting the reference image to slowly occurring changes in the total image, as described above.
The amount of information required to represent the image of only the moving object is usually considerably less than the amount of information required to represent the object and its background. As a result, the object-image signal can be conveyed using less bandwidth or stored using less space than would be required to convey or store the total-image signal.
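As a rough illustration (the numbers are assumed, not taken from the disclosure): if a person's image occupies about one fifth of a 320 x 240 frame, an object-image signal covering only that region carries on the order of 15,000 pixels per frame rather than 76,800, roughly a five-fold reduction even before any compression of the image information is applied.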
In preferred embodiments, an operator is able to manipulate the displayed image in a variety of ways including moving, resizing, hiding, maximizing and minimizing. Such operations may be performed, for example, by a combination of actions using a keyboard and/or a pointing device. The term "moving" refers to changing the location of the image within the screen display. The term "resizing" refers to changing the displayed dimensions of the image. The term "hiding" refers to obscuring the image with one or more other visual elements. The term "maximizing" refers to enlarging the size of the image so that it fills the display screen; preferably, all other visual elements in the display are obscured by a user-selectable background when an image is maximized. The term "minimizing" refers to replacing the image with a small visual element so that the image is effectively removed from the screen. The small visual element may have a static appearance or it may include a dynamic appearance such as a very small presentation of the image. Preferably, some visually distinctive feature indicates when an object image is selected for moving or resizing. For example, a frame and/or opaque background could be added to the image presentation. The frame could be a conventional frame such as that shown for window 530 or it could be a distinctive frame presented, for example, as a rectangle formed by faint or broken lines. One way in which such a presentation may be accomplished is to display the moving object image in a frameless window with a transparent background. The frameless window can be manipulated in a manner similar to that for more conventional opaque windows with frames. Alternatively, the cursor of a pointing device could change in appearance depending on whether the cursor points inside the window, at the edge of the window, or outside the window. As a familiar example, the cursor could be a single-headed arrow when outside the window, a double-headed arrow when at the edge of the window indicating that the window can be resized, and a four-headed arrow when inside the window indicating that the window can be moved.
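As one illustration of the cursor behavior described above, a hit-test routine of the following kind could select the cursor shape for a frameless object window; the region logic, margin, and cursor identifiers are assumptions for the sketch:

```python
def cursor_for_position(x, y, win_left, win_top, win_width, win_height, margin=4):
    """Pick a cursor shape for a frameless object window (illustrative only)."""
    inside_x = win_left - margin <= x <= win_left + win_width + margin
    inside_y = win_top - margin <= y <= win_top + win_height + margin
    if not (inside_x and inside_y):
        return "single_headed_arrow"           # outside the window
    on_edge = (abs(x - win_left) <= margin
               or abs(x - (win_left + win_width)) <= margin
               or abs(y - win_top) <= margin
               or abs(y - (win_top + win_height)) <= margin)
    if on_edge:
        return "double_headed_arrow"           # edge: the window can be resized
    return "four_headed_arrow"                 # interior: the window can be moved
```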
Efforts to standardize the system-operator interface across a wide range of products have contributed to the familiarity of those skilled in the art with the implementation of products having such interfaces. Applications executing in environments such as those mentioned above generally conform to a desktop metaphor with icons and various system processes which provide for selecting and interacting with various visual elements. Features such as window scroll bars, window moving and resizing, and dialog boxes are commonplace. Details of implementation compatible with the products mentioned above may be obtained from references such as "The Windows Interface: An Application Design Guide" for Microsoft Windows 3.1, 1992, Microsoft Corp.; Petzold, "Programming Windows," 2nd ed., 1990, Microsoft Press; "Macintosh User's Guide for Desktop Macintosh Computers," 1992, Apple Computer, Inc.; the "Inside Macintosh" developer reference manuals, Apple Computer, Inc.; and "GEOS System Software Overview," 2nd ed., 1993, Geoworks Inc., all of which are incorporated by reference in their entirety. The details of implementation may utilize features which are generic to a wide variety of environments; however, these details should be understood to represent only examples of how a system incorporating the present invention may be implemented.

Claims

1. A method for making a visual presentation in a computer display comprising: receiving at a physically-remote location a live total-image signal representing an image of a moving object in front of a background, receiving at said physically-remote location a control signal indicating what portion of said total-image signal corresponds to the image of only said object, generating at said physically-remote location an object-image signal in response to said total-image signal and said control signal such that said object-image signal includes image information representing the image of only said object and includes decoding information needed to decode said image information and reconstruct the image of said object on a display, transmitting at said physically-remote location said object-image signal, receiving said object-image signal and decoding said image information according to said decoding information so as to generate a visual-image signal representing the image of said moving object, and presenting in said computer display one or more visual elements and, in response to said visual-image signal, a representation of said moving object such that the image of said moving object without background appears in front of said visual elements, wherein said one or more visual elements either present results of operation by said computer or provide facilities by which operation of said computer may be affected by requests from an operator of said computer.
2. A method according to claim 1 wherein generating said object-image signal comprises adaptively clipping said total-image signal so as to stabilize position and size of the image of said moving object.
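By way of illustration only, and not as a limitation of claim 2, the following sketch shows one way adaptive clipping might be approximated in software: the bounding box of the keyed object pixels is found for each frame and low-pass filtered over time so the clipped image does not jitter in position or size. The names and the smoothing constant are assumptions of the example.

```c
/* Illustrative sketch of adaptive clipping: locate the keyed object in each
 * frame and smooth the bounding box over time so the clipped object image
 * stays stable in position and size. */
#include <stdint.h>

typedef struct { float left, top, right, bottom; } BoxF;

/* key: w*h mask in which nonzero entries mark object pixels
 * (i.e., the portion identified by the control signal). */
static BoxF find_object_box(const uint8_t *key, int w, int h)
{
    BoxF b = { (float)w, (float)h, 0.0f, 0.0f };
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            if (key[y * w + x]) {
                if ((float)x < b.left)   b.left   = (float)x;
                if ((float)y < b.top)    b.top    = (float)y;
                if ((float)x > b.right)  b.right  = (float)x;
                if ((float)y > b.bottom) b.bottom = (float)y;
            }
    return b;   /* degenerate (left > right) if no object pixels were found */
}

/* One-pole smoothing: alpha near 0 reacts slowly and suppresses jitter,
 * alpha near 1 tracks the measured box almost instantly. */
static void stabilize_box(BoxF *state, BoxF now, float alpha)
{
    state->left   += alpha * (now.left   - state->left);
    state->top    += alpha * (now.top    - state->top);
    state->right  += alpha * (now.right  - state->right);
    state->bottom += alpha * (now.bottom - state->bottom);
}
```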
3. A method according to claim 1 which further comprises: initializing a reference image and adapting at least a part of said reference image in response to portions of said total-image signal corresponding to said background, comparing said total-image signal with said reference image and generating a difference signal, and generating said control signal in response to said difference signal.
4. A method according to claim 1 which further comprises: initializing a reference image, analyzing said total-image signal so as to determine first portions of said total-image signal in which motion has occurred and to determine second portions of said total-image signal in which motion has not occurred, adapting portions of said reference image corresponding to and in response to said second portions of said total-image signal, comparing said total-image signal with said reference image and generating a difference signal, and generating said control signal in response to said difference signal.
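Again by way of illustration only, a minimal software sketch of the reference-image approach of claims 3 and 4 follows: the stored reference is adapted only where no frame-to-frame motion is detected, and the control (key) signal is the thresholded difference between the current frame and that reference. The threshold values and blend factor are placeholders chosen for the example, not values taught by the disclosure.

```c
/* Illustrative sketch only: maintain a reference (background) image that is
 * adapted where no motion occurs, and derive the control signal from the
 * thresholded difference between the live frame and that reference. */
#include <stdint.h>
#include <stdlib.h>

#define MOTION_THRESH 12   /* frame-to-frame change that counts as motion   */
#define KEY_THRESH    20   /* frame-to-reference change that keys "object"  */

void update_key(const uint8_t *frame, const uint8_t *prev_frame,
                float *reference, uint8_t *key, int n_pixels)
{
    for (int i = 0; i < n_pixels; ++i) {
        int moved = abs((int)frame[i] - (int)prev_frame[i]) > MOTION_THRESH;

        /* Adapt the reference only where no motion occurred, so the moving
         * object is never absorbed into the stored background. */
        if (!moved)
            reference[i] += 0.05f * ((float)frame[i] - reference[i]);

        /* The difference against the reference, thresholded, becomes the
         * control signal marking object pixels. */
        int diff = abs((int)frame[i] - (int)(reference[i] + 0.5f));
        key[i] = (diff > KEY_THRESH) ? 255 : 0;
    }
}
```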
5. A method according to claim 1 which further comprises moving or resizing said representation under control of said operator.
6. A method according to claim 1 wherein presenting in a computer display said representation of said moving object comprises adaptively clipping said visual-image signal so as to stabilize position and size of the image of said moving object.
7. A system for making a visual presentation in a computer display comprising: an input terminal coupled to a live video signal source, a first switch having a first input coupled to said input terminal and having a second input, a first frame store having an input coupled to an output of said first switch and having an output coupled to said second input of said first switch, a first difference circuit having an input coupled to said input terminal and having another input coupled to said output of said first frame store, a video processor having a video-signal input coupled to said input terminal and having a key-signal input coupled to an output of said first difference circuit, a transmission path coupled to an output terminal of said video processor, a computer physically remote from said video processor and comprising a video interface coupled to said transmission path, an operator input device, a processor responsive to said video interface and said input device, and a video display responsive to said processor, wherein said processor causes said video display to present an image in response to signals received by said video interface and to present one or more visual elements.
8. A system according to claim 7 further comprising: a second switch having an input coupled to said output of said first difference circuit and having a switch-control input, a first amplifier having an input coupled to an output of said second switch, a sum circuit having an input coupled to said output of said first frame store, having another input coupled to an output of said first amplifier, and having an output coupled to said second input of said first switch, a second frame store having an input coupled to said input terminal, a second difference circuit having an input coupled to said input terminal and having another input coupled to an output of said second frame store, a first threshold circuit having an input coupled to an output of said second difference circuit, a third switch having a first input coupled to a reference signal source, having a second input, and having a switch-control input coupled to an output of said first threshold circuit, a third frame store having an input coupled to an output of said third switch, a second amplifier having an input coupled to an output of said third frame store and having an output coupled to said second input of said third switch, and a second threshold circuit having an input coupled to said output of said third switch and having an output coupled to said switch-control input of said second switch.
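The following sketch is one possible software reading of the per-pixel signal flow recited in claims 7 and 8, offered only to illustrate how the recited frame stores, switches, difference circuits, amplifiers and threshold circuits might cooperate; the mapping, gains and thresholds are assumptions of the example rather than a definitive construction of the claims.

```c
/* Hypothetical per-pixel software model of the signal flow of claims 7-8.
 * Each variable plays the role of one claimed element; numeric constants
 * are placeholders, not values taken from the patent. */
#include <math.h>

typedef struct {
    float reference;    /* first frame store: stored background estimate  */
    float prev_frame;   /* second frame store: previous input frame       */
    float persistence;  /* third frame store: decaying motion indicator   */
} PixelState;

/* Processes one pixel of the live total-image signal and returns the
 * difference (key) value that would drive the video processor. */
float process_pixel(PixelState *s, float in)
{
    /* Second difference circuit + first threshold circuit: detect motion
     * between the current frame and the previous frame. */
    int motion = fabsf(in - s->prev_frame) > 10.0f;

    /* Third switch, third frame store and second amplifier: keep a decaying
     * record of recent motion (reload from the reference source on motion,
     * otherwise feed back an attenuated copy of the stored value). */
    s->persistence = motion ? 255.0f : 0.9f * s->persistence;

    /* Second threshold circuit: enable adaptation of the background only
     * where the pixel has been still for some time. */
    int adapt = s->persistence < 32.0f;

    /* First difference circuit: current frame minus stored reference. */
    float diff = in - s->reference;

    /* Second switch, first amplifier, sum circuit, first switch and first
     * frame store: feed a small fraction of the difference back into the
     * stored reference when adaptation is enabled. */
    if (adapt)
        s->reference += 0.05f * diff;

    s->prev_frame = in;
    return diff;
}
```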
9. A system for making a visual presentation in a computer display comprising: means at a physically-remote location for receiving a live total-image signal representing an image of an object in front of a background, means at said physically-remote location for receiving a control signal indicating what portion of said total-image signal corresponds to the image of only said object, means at said physically-remote location for generating an object-image signal in response to said total-image signal and said control signal such that said object-image signal includes image information representing the image of only said object and includes decoding information needed to decode said image information and reconstruct the image of said object on a display, means at said physically-remote location for transmitting said object-image signal, means for receiving from said physically-remote location said object-image signal and decoding said image information according to said decoding information so as to generate a visual-image signal representing the image of said moving object, and means for presenting in said computer display one or more visual elements and, in response to said visual-image signal, a representation of said moving object such that the image of said moving object without background appears in front of said visual elements, wherein said one or more visual elements either present results of operation by said computer or provide facilities by which operation of said computer may be affected by requests from an operator of said computer.
10. A system according to claim 9 wherein said means for generating said object-image signal comprises means for adaptively clipping said total-image signal so as to stabilize position and size of the image of said moving object.
11. A system according to claim 9 which further comprises: means for initializing a reference image and adapting at least a part of said reference image in response to portions of said total-image signal corresponding to said background, means for comparing said total-image signal with said reference image and generating a difference signal, and means for generating said control signal in response to said difference signal.
12. A system according to claim 9 which further comprises: means for initializing a reference image, means for analyzing said total-image signal so as to determine first portions of said total-image signal in which motion has occurred and to determine second portions of said total-image signal in which motion has not occurred, means for adapting portions of said reference image corresponding to and in response to said second portions of said total-image signal, means for comparing said total-image signal with said reference image and generating a difference signal, and means for generating said control signal in response to said difference signal.
13. A system according to claim 9 further comprising means for moving or resizing said representation under control of said operator.
14. A system according to claim 9 wherein said means for presenting in a computer display said representation of said moving object comprises means for adaptively clipping said visual-image signal so as to stabilize position and size of the image of said moving object.
PCT/US1997/006352 1996-04-18 1997-04-17 Method and apparatus for presenting in a computer display images representing remote moving objects WO1997039581A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US63419696A 1996-04-18 1996-04-18
US08/634,196 1996-04-18

Publications (1)

Publication Number Publication Date
WO1997039581A1 true WO1997039581A1 (en) 1997-10-23

Family

ID=24542794

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1997/006352 WO1997039581A1 (en) 1996-04-18 1997-04-17 Method and apparatus for presenting in a computer display images representing remote moving objects

Country Status (1)

Country Link
WO (1) WO1997039581A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0264965A2 (en) * 1986-10-24 1988-04-27 The Grass Valley Group, Inc. Video difference key generator
US4800432A (en) * 1986-10-24 1989-01-24 The Grass Valley Group, Inc. Video Difference key generator
WO1996009722A1 (en) * 1994-09-19 1996-03-28 Teleport Corporation Teleconferencing method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WATABE ET AL.: "A distributed multiparty desktop conferencing system and its architecture", IEEE Conference Proceedings, 9th Annual International Conference on Computers and Communications, Scottsdale (US), 21 March 1990, XP000140027 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2352917A (en) * 1999-05-12 2001-02-07 Nec Corp Television conference system with user-defined background image or setting.
US6529231B1 (en) 1999-05-12 2003-03-04 Nec Corporation Television meeting system
GB2352917B (en) * 1999-05-12 2003-07-23 Nec Corp Television meeting system

Similar Documents

Publication Publication Date Title
US6205260B1 (en) Sprite-based video coding system with automatic segmentation integrated into coding and sprite building processes
US6646655B1 (en) Extracting a time-sequence of slides from video
US6278466B1 (en) Creating animation from a video
US6268864B1 (en) Linking a video and an animation
Tonomura et al. Videomap and videospaceicon: Tools for anatomizing video content
EP2109313B1 (en) Television receiver and method
US6670994B2 (en) Method and apparatus for display of interlaced images on non-interlaced display
US7187415B2 (en) System for detecting aspect ratio and method thereof
US20080101456A1 (en) Method for insertion and overlay of media content upon an underlying visual media
US7751683B1 (en) Scene change marking for thumbnail extraction
CN1295287A (en) Method and system for editing operating image data and providing target imformation acording to the data
US7876996B1 (en) Method and system for time-shifting video
EP1097568A2 (en) Creating animation from a video
US20040008198A1 (en) Three-dimensional output system
JP4229702B2 (en) Local improvement of display information
WO1997039581A1 (en) Method and apparatus for presenting in a computer display images representing remote moving objects
US20020083091A1 (en) Seamless integration of video on a background object
US20020163501A1 (en) Method and device for video scene composition including graphic elements
US5990959A (en) Method, system and product for direct rendering of video images to a video data stream
WO1996041469A1 (en) Systems using motion detection, interpolation, and cross-dissolving for improving picture quality
US20050259750A1 (en) Method and encoder for encoding a digital video signal
EP1338149A1 (en) Method and device for video scene composition from varied data
CN100346642C (en) Method and device for displaying specific pcitures
KR100477632B1 (en) Image signal processing apparatus in computer
CN117241089A (en) Video processing method, electronic device, and readable storage medium

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 97537353

Format of ref document f/p: F

122 Ep: pct application non-entry in european phase