EP0832537A1 - Method and apparatus for inserting dynamic and static images into a real-time video transmission

Method and apparatus for inserting dynamic and static images into a real-time video transmission

Info

Publication number
EP0832537A1
Authority
EP
European Patent Office
Prior art keywords
image
landmarks
scene
current
insertable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP96921559A
Other languages
English (en)
French (fr)
Other versions
EP0832537A4 (de)
Inventor
Darrell S. Di Cicco
Karl Fant
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Princeton Video Image Inc
Original Assignee
Princeton Video Image Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US08/563,598 (external priority: US5892554A)
Application filed by Princeton Video Image Inc filed Critical Princeton Video Image Inc
Publication of EP0832537A1
Publication of EP0832537A4
Legal status: Withdrawn


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/272Means for inserting a foreground image in a background image, i.e. inlay, outlay
    • H04N5/2723Insertion of virtual advertisement; Replacing advertisements physically present in the scene by virtual advertisement

Definitions

  • a system and method facilitates the insertion of dynamic and static images and other indicia into live broadcast video images on a real-time basis so that they appear to be part of the original broadcast.
  • the present invention represents a significant improvement over various prior art approaches to the problem of inserting images into a live video broadcast.
  • the prior art techniques suffer from the inability to rapidly detect and track landmarks and insert a dynamic or static image into a live video broadcast in a realistic manner.
  • many prior art techniques are computationally intense and require cumbersome and complicated computer systems to achieve their goals.
  • sensor means such as gyro compasses, potentiometers, inertial navigation instruments, and inclinometers are used to generate information regarding camera tilt angles, aperture angles, and the like.
  • the use of X and Y encoders in the context of a video insertion system has also been described, among other places, in Patent Abstracts of Japan, "Picture Synthesizer," Vol. 15, No. 8 (E-1042), 8 March 1991, and JP-A-02 306782 (Asutoro Design K.K.), 20 December 1990. It is also believed that X and Y sensors have previously been used in Europe to assist in the placement of inserts into live video broadcasts.
  • U.S. Patent 5,264,933 discusses, in detail, a method for placing a logo or other indicia into, for example, a tennis court during a live broadcast.
  • a target zone is pre-selected for receiving insertable images into the broadcast image.
  • the target zone is spatially related to certain landmarks that represent distinguishable characteristics of the background scene being captured by the camera.
  • the system always looks for landmarks in the target zone, but the patent also discloses that landmarks outside of the target zone can be employed too.
  • Landmarks identified by the processor during broadcast are compared against a reference set of landmarks identified in a reference image. When sufficient verification has occurred, the operator inserts an image into the pre-selected target zone of the broadcast image.
  • the target zone could be the space between the uprights of a goalpost.
  • the target zone could be a portion of the wall behind home plate.
  • the Burt Pyramid technique described above and known in the prior art involves the reduction of an image into decimated, low-resolution versions which permit the rapid location and identification of prominent features, generally referred to as landmarks.
  • the Burt Pyramid is one of several well known prior art techniques that can be employed to identify landmark features in an image for the purpose of replacing a portion of the image with an insert in the context of a live video broadcast.
  • Luquet et al., U.S. Patent 5,353,392 discloses a system that is limited to modifying the same zone, referred to as a target zone, in successive images.
  • Luquet '392 suffers from some of the same drawbacks as Rosser '933, namely, that the inserted image is tied to a fixed location, or target zone, within the overall image.
  • the present invention, as discussed in the "Detailed Description of the Preferred Embodiment" later in this disclosure, is capable of inserting an image virtually anywhere within the overall broadcast scene, independent of the identification of a specific insertion or target zone.
  • Zoom correction and occlusion processing are discussed in PCT application PCT/US94/11527, assigned to ORAD, Inc. According to that system, sensors are placed on the periphery of the camera zoom lens. The sensors mechanically detect the rotation of the zoom lens and calculate a corresponding zoom factor. The zoom factor is then fed to a computer system to correct the size of the intended insert. Systems of this type suffer from mechanical drawbacks such as jitter, which may introduce an error factor rendering the size of an insertable image unacceptably variable.
  • the present invention overcomes such mechanical drawbacks by determining the changed positions of landmarks within the current image and automatically applying a corresponding zoom factor to the insertable image. The present invention relies on landmark positions within the current image and not on external factors subject to motion or jitter. Thus, any sudden, unwanted camera motion or lens movement will not affect the zoom adjustment calculations.
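The idea of recovering zoom and translation purely from landmark displacements can be sketched as a closed-form least-squares fit of a scale-plus-translation model. The function below is a hypothetical helper, not the patent's actual formulation; it assumes landmark correspondences between the reference array and the current image are already known.

```python
def estimate_zoom_and_translation(ref_pts, cur_pts):
    """Fit cur ~= k * ref + t from corresponding landmark positions.

    ref_pts, cur_pts: lists of (x, y) tuples, same length and order.
    Returns (k, tx, ty). Illustrative sketch only.
    """
    n = len(ref_pts)
    # Centroids of each point set.
    rx = sum(p[0] for p in ref_pts) / n
    ry = sum(p[1] for p in ref_pts) / n
    cx = sum(p[0] for p in cur_pts) / n
    cy = sum(p[1] for p in cur_pts) / n
    # Zoom is the ratio of centred dot products (scale-only least squares).
    num = sum((r[0] - rx) * (c[0] - cx) + (r[1] - ry) * (c[1] - cy)
              for r, c in zip(ref_pts, cur_pts))
    den = sum((r[0] - rx) ** 2 + (r[1] - ry) ** 2 for r in ref_pts)
    k = num / den
    # Translation maps the reference centroid onto the current centroid.
    tx = cx - k * rx
    ty = cy - k * ry
    return k, tx, ty
```

Because only landmark positions inside the current image enter the fit, external jitter sources have no direct path into the zoom estimate, which is the advantage the paragraph above claims over lens-mounted sensors.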
  • identification of the target area may be important. The situation becomes much more difficult if it is desired to place a static image someplace other than in the "target zone" or to insert a dynamic image, i.e., one that can move, into a live video scene.
  • the insertable image may be dynamic either in the sense that the image moves across the scene or the image itself changes from frame to frame, or both. Imagine, for example, the difficulties of superimposing a rabbit, beating a drum, simultaneously moving across the field of view into a live video broadcast.
  • the invention comprises a system and method for inserting static and dynamic images into a live video broadcast in a realistic fashion on a real time basis.
  • the operator of the system selects certain natural landmarks in a scene that are suitable for subsequent detection and tracking.
  • landmarks may be selected by the electronic system and then approved or rejected by the operator. It is important that the natural landmarks survive decimation, i.e., remain recognizable, during the recognition step, which may be Burt Pyramid analysis.
  • Landmarks preferably comprise sharp, bold and clear vertical, horizontal, diagonal or corner features within the scene visible to the video camera as it pans and zooms.
  • at least three natural landmarks are selected.
  • the landmarks are distributed throughout the entire scene, such as a baseball park or a football stadium, and the field of view of the camera at any instant is normally significantly smaller than the full scene that may be panned.
  • the landmarks are often located outside of the destination point or area where the insert will be placed because the insert area is typically too small to include numerous identifiable landmarks, and because the insertable image may be a dynamic one and, therefore, has no single, stationary target destination.
  • the system models the recognizable natural landmarks on a deformable two-dimensional grid.
  • An arbitrary, non-landmark, reference point is chosen within the scene.
  • the reference point is mathematically associated with the natural landmarks and is subsequently used to locate the insertion area.
  • a point on the insert, located, for example, at either the lower left or upper right hand corner of the insert, such as the case where the insert might be in the shape of a square or rectangle, may be aligned with the reference point.
  • the insert may be aligned at any fixed distance from the reference point. If the insert is dynamic, then the point is used as an origin to drive the dynamic image throughout the field of view.
  • the location of the dynamic image changes from frame to frame as the distance of the dynamic image incrementally changes with respect to the reference point.
  • the reference point may be located out of the field of view of the camera.
  • the reference point may be any point on the grid including the origin.
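Driving a dynamic insert from the reference point amounts to updating an offset each field. The sketch below assumes a constant per-field velocity purely for illustration; the names and the velocity model are not from the patent.

```python
def dynamic_insert_position(ref_point, field_index, velocity):
    """Position of a moving insert for a given video field.

    ref_point: (x, y) of the arbitrary reference point, which may lie
               outside the current field of view.
    velocity:  (vx, vy) displacement in pixels per field (illustrative).
    The insert's location is always expressed relative to the reference
    point, so only this offset changes from field to field.
    """
    rx, ry = ref_point
    vx, vy = velocity
    return (rx + vx * field_index, ry + vy * field_index)
```

For example, an insert anchored 50 pixels left of the visible grid and drifting rightward 3 pixels per field would re-enter the field of view after a predictable number of fields.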
  • Prior to the live video insertion process, the system operator prepares artwork of the image to be inserted and adjusts it for perspective, i.e., shape. Because the system knows the mathematical relationship between the landmarks in the scene, it can automatically determine the zoom factor and X, Y position adjustment that must be applied to the insertable image just prior to insertion. Thereafter, when the camera zooms in and out and changes its field of view as it pans, the insertable image remains properly scaled and proportioned with respect to the other features in the field of view so that it looks natural to the home viewer. As long as the field of view includes at least three suitable landmarks, the system can always establish where it is with respect to the reference point.
  • the operator can make further minor adjustments such as moving the insertable image up and down with a pair of X and Y buttons and/or adjust it for size (i.e., zoom) to better fit the background against which it is located. Such adjustments may take place prior to or during the event.
  • the system can easily place an insertable image at or in any location. Because the system does not require substantial amounts of computational power, as is required by most other systems, the insertable image appears much more quickly and naturally in a given scene and does not "pop up" as is the case with other prior art systems.
  • FIG. 1 illustrates a view of a baseball stadium during a live video broadcast showing a camera's current field of view identified within the stadium view.
  • FIG. 2 illustrates a view of a baseball stadium prior to a broadcast showing a camera's current field of view identified within the stadium view.
  • FIG. 3 illustrates a live shot of the camera's field of view overlaid with landmarks and an X, Y grid.
  • FIG. 4 is a block diagram showing the major components of the video insertion system according to the preferred embodiment of the invention.
  • FIG. 5 is a detailed schematic diagram of the video insertion system according to the preferred embodiment of the invention.
  • FIG. 6 is a block diagram of the Landmark Tracking Board (LTB).
  • FIG. 7A is a mathematical landmark model of Fig. 2.
  • FIG. 7B illustrates a detailed portion of Fig. 7A.
  • FIGS. 8A-D illustrate four levels of decimation of the detail shown in Fig. 2, where the resolution is halved in each level.
  • FIGS. 9A-D illustrate four levels of decimation of the current image with the landmark model superimposed thereon, where the resolution is halved in each level.
  • FIG. 10A illustrates a tennis court showing one potential position in which an insert can be placed during a broadcast.
  • FIG. 10B is the initial artwork of a logo to be inserted into the image of the tennis court shown in Fig. 10A.
  • FIG. 10C illustrates a warped representation of Fig. 10A adjusted for the perspective of the tennis court during broadcast.
  • FIG. 10D illustrates a warped representation of Fig. 10C adjusted for a magnification zoom.
  • FIG. 10E illustrates a warped representation of Fig. 10C adjusted for a shrink zoom.
  • FIG. 11 illustrates the current image with the stationary insert placed in a location without occlusion.
  • FIG. 12 illustrates the current image with the stationary insert placed in a location with occlusion.
  • FIG. 13 illustrates the current image with the stationary insert placed in a location without occlusion but adjusted for magnification zoom.
  • FIG. 14 illustrates the current image with the stationary insert placed in a location compensated for occlusion and adjusted for a magnification zoom.
  • FIG. 15 illustrates a dynamic insertion of a logo showing the logo moving left to right.
  • FIG. 16 is a schematic representation of the hardware implementation of the system and method of the preferred embodiment of the invention.
  • an insertable image is capable of being inserted at any location within the current image without requiring identification of an existing advertisement or a "target zone" area. Rather, a mathematical landmark model and related coordinate system imposed thereon are used to permit the system operator to pinpoint the location of an insertion virtually anywhere within the current image.
  • Fig. 1 illustrates a stadium view 10 of a typical scene during a baseball game.
  • a standard television camera 12 is shown with the current field of view 14 highlighted.
  • the pan range of camera 12 may include most of the stadium.
  • the remaining area 18 is outside of the camera's current field of view and comprises the rest of the stadium view not in the pan range.
  • pitcher 20a is shown delivering a pitch to catcher 20c.
  • a batter 20b stands poised to hit baseball 24 while umpire 20d observes the action.
  • the present invention will place an advertisement or a commercial logo on the wall behind home plate during the broadcast of the game.
  • In order to insert an image into a live broadcast, the invention must be able to recognize the current image so that it can properly place an insert.
  • the invention employs a landmark mapping scheme wherein prominent features ofthe scene have been predefined as landmarks.
  • Landmarks are not determined as a function of the position of the insertion region and are preferably not within the insertion region.
  • the landmarks are not unique to a particular insertion region. Rather, as dictated by the features of the reference image, the landmarks are spread like a constellation or tapestry throughout the reference image.
  • the same set of landmarks is capable of locating numerous different insertion regions within the reference image. Recognition of the insertion region, sometimes referred to in the prior art as the "target zone", is, therefore, unnecessary.
  • Landmark types generally comprise horizontal, vertical, diagonal, and corner features of a scene.
  • the vertical seams of backboards 26 comprise vertical features 28 while the top and bottom horizontal edges of backboards 26 comprise horizontal features 30.
  • Corner features 32 are defined at points where vertical features 28 and horizontal features 30 intersect. However, the whole region of panning, even outside the current field of view, contains features.
  • Referring to Fig. 2, before an insertable image can be inserted into a live broadcast, the invention must have information regarding the location and types of landmarks. This is achieved by creating a reference image of the stadium in which landmarks are placed according to the prominent features of a given scene.
  • a preliminary stadium view 40 of an empty stadium is shown.
  • Camera 12 portrays the empty stadium 40 from the same perspective as in Fig. 1.
  • Backboards 26 are shown with vertical landmarks 42, horizontal landmarks 44 and corner landmarks 46.
  • Fig. 2 comprises a portion of a pictorial representation of the reference array 48.
  • Although the reference array 48 has been depicted pictorially, in reality it is nothing more than a data table of landmark locations and types which encompasses the entire scene to be panned.
  • After the reference array 48 is obtained, its pictorial representation is analyzed. The analysis is premised on the use of the Burt Pyramid algorithm, which can decimate the reference image into as many as four levels (e.g., levels 0-3), each level having half the resolution of the previous one. Referring now to Figs. 8A-D, four levels of decimation are shown with varying degrees of resolution.
  • the level 0 image 144 has the highest resolution at 240 x 720 pixels.
  • the level 1 image 146 has half the resolution of level 0, namely 120 x 360 pixels.
  • the level 2 image 148 has half the resolution of the level 1 image 146, namely 60 x 180 pixels.
  • the level 3 image 150, the lowest level, identifies relatively coarse features of landmarks 42, 44, 46 that survive to a resolution of 30 x 90 pixels. For each level, only the resolution changes. The size and the scale of the reference image 48 do not change for the different levels.
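The four-level decimation can be sketched as building an image pyramid in which each level halves the resolution of the previous one. The real Burt Pyramid applies Gaussian-like low-pass filtering before subsampling; the 2x2 block average below is a simplified stand-in for that step.

```python
def decimate(image):
    """Halve resolution by averaging 2x2 pixel blocks.

    A simplified stand-in for the Gaussian filtering and subsampling
    used in the actual Burt Pyramid. 'image' is a 2D list of values
    with even width and height.
    """
    h, w = len(image), len(image[0])
    return [[(image[y][x] + image[y][x + 1] +
              image[y + 1][x] + image[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def build_pyramid(image, levels=4):
    """Levels 0-3: level 0 is full resolution, each later level is half.

    For a 240 x 720 level 0 image this yields the 120 x 360, 60 x 180,
    and 30 x 90 levels described in the text.
    """
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(decimate(pyramid[-1]))
    return pyramid
```

As the text notes, only the resolution changes between levels; the scene coordinates a landmark maps to are simply divided by two at each level.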
  • the positions of various landmarks 42, 44, 46 are determined within the pictorial representation of the reference array 48 at all levels by a light pen, a trackball locator, or other similar means.
  • Prominent features in the pictorial representation of the reference array, such as the goal posts in a football stadium or a physical structure present in the pictorial representation of the reference array 48, i.e., the wall behind home plate, are used by the system operator as the landmarks at each level. Landmarks are often selected such that they will survive decimation and remain recognizable to at least level 3.
  • Each landmark is assigned an X,Y coordinate location.
  • each landmark is assigned a type, e.g., vertical, horizontal, diagonal, or corner.
  • the location for each landmark 42, 44, 46 is then stored in a computer system and this stored data set is the reference array itself.
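Since the reference array is just a stored table of landmark locations and types, a minimal representation might look like the following. The record layout and coordinate values here are illustrative assumptions, not the patent's storage format.

```python
from dataclasses import dataclass

@dataclass
class Landmark:
    x: float
    y: float
    kind: str  # "vertical", "horizontal", "diagonal", or "corner"

# The reference array is simply a list of such records covering the
# whole pannable scene (coordinates below are made up for illustration).
reference_array = [
    Landmark(120.0, 40.0, "vertical"),
    Landmark(180.0, 40.0, "vertical"),
    Landmark(120.0, 10.0, "horizontal"),
    Landmark(120.0, 40.0, "corner"),
]

def landmarks_of_kind(array, kind):
    """Select all landmarks of one feature type from the table."""
    return [lm for lm in array if lm.kind == kind]
```

The type field matters later in the pipeline, because search templates are tailored per feature type and the zoom calculation requires a minimum number of vertical landmarks.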
  • a reference location 49 of an insertion region is selected by an operator using a light pen, a trackball locator, or other similar means.
  • the operator selects a single X,Y coordinate location as the reference location 49 of the insertion region.
  • This reference location 49 could correspond to the center point, the top-right corner of the intended insert, or any other suitable point of the insertable image.
  • the reference location 49 is then stored in the computer system and is used to facilitate insertion of an insertable image as a function of the mathematical relationship of the reference location and the landmark locations.
  • the reference location may simply be the origin of the X,Y coordinate system, in which case the insertable image is inserted at an X,Y position chosen by the operator.
  • the reference location 49 of an insertion is that location in the reference array 48 which defines the position to which the insertable image will be related. Selecting the landmarks 42, 44, 46 and the reference location 49 is done prior to the real-time insertion process. Further, the initial reference location can be changed by the system operator during the broadcast.
  • the insertable image is placed in the pictorial representation of the reference array 48 at the selected reference location 49. Next it is warped so that the pattern size and shape, i.e., perspective, is adjusted at the reference location 49 so that it fits snugly within the intended insertion area. The adjusted insertable image is then stored for use in the real-time insertion process.
  • Preparing a logo for insertion into a broadcast is illustrated in Figs. 10A through 10E.
  • In Fig. 10A, an empty tennis court 160 is shown as the reference image. Within the court, an intended area of insertion 162 is shown.
  • Although a tennis court has a rectangular shape, when viewed through a camera from a far end the court appears on video to have a slightly trapezoidal shape. Therefore, it is crucial to have the inserted logo reflect the slight trapezoidal nature of the image.
  • Fig. 10B shows the artwork of a logo 164 in its original form.
  • Fig. 10C shows a warped form of the logo 166 after it has been adjusted for its trapezoidal appearance due to the camera's point of view.
  • Figs. 10D and 10E each show the warped logo after being adjusted for a magnification or zoom factor.
  • the logos 168, 170, shown in Figs. 10D and 10E respectively, are warped for magnification or zoom only. This zoom warping occurs during the broadcast just prior to insertion, as opposed to shape warping, which occurs prior to the broadcast.
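The broadcast-time zoom warp is a uniform rescale of the already perspective-corrected logo. A minimal sketch using nearest-neighbour sampling follows; a production warper would interpolate, and this helper name is an assumption, not the patent's component.

```python
def zoom_warp(logo, k):
    """Scale a logo (2D list of pixel values) by zoom factor k.

    Nearest-neighbour sampling: each output pixel copies the source
    pixel at the inverse-scaled coordinate. k > 1 magnifies (Fig. 10D),
    k < 1 shrinks (Fig. 10E). Simplified sketch only.
    """
    h, w = len(logo), len(logo[0])
    nh, nw = max(1, int(h * k)), max(1, int(w * k))
    return [[logo[min(h - 1, int(y / k))][min(w - 1, int(x / k))]
             for x in range(nw)]
            for y in range(nh)]
```

Only this uniform scale needs to run per field; the more expensive perspective (shape) warp is precomputed before the broadcast, which is what makes real-time operation feasible.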
  • Fig. 3 is a superimposed or overlaid view 50 of the current field of view 14 of camera 12 in Fig. 1.
  • a grid 52 has been superimposed over backboards 26.
  • Landmarks 42, 44, 46 have also been overlaid onto the backboards 26. All the landmarks 42, 44, 46 in the current scene are searched for while the system is in the search mode so that the invention will be able to locate the proper point of insertion or reference location 49 for an advertisement or commercial logo.
  • the system uses only those landmarks that it finds in the field of view.
  • Fig. 4 shows a general system diagram 60 of the major components of the system.
  • Each field of a video signal 62 enters a filtering and decimation process 64, which cleans up the current field image and then decimates it in the same manner discussed above.
  • Decimation of the current field image is depicted in Figs. 9A-D, in which the level 0 image 152 has the highest resolution, followed by the level 1 image 154, the level 2 image 156, and the level 3 image 158 having the coarsest resolution.
  • the decimated field image is then fed to the landmark search and tracking board 66, which has three functions.
  • the first function is to search the decimated field image for landmarks.
  • the second function is to verify the position of all landmarks found with respect to the reference array.
  • the third function is to track the motion of the camera including any changes in magnification or zoom.
  • the landmark tracking board 66 collects information and generates a data signal 68 containing illumination data, magnification data, horizontal location data, and vertical location data. This data signal 68 is then fed to an occlusion processor 72.
  • the occlusion processor 72 decides whether the intended area of insert within the current image is being occluded, i.e. blocked in whole or in part by the action in the current scene.
  • the result of the occlusion processor 72 is a signal 74 containing occlusion data which is fed into the insertion processor 76.
  • the current image 62 and the insertable image are combined with the occlusion signal 74 yielding the output video image 78.
  • Fig. 5 is a block diagram of the live video insertion system showing the timing of the entire process.
  • Although the current system requires eight fields to accomplish the seamless insertion of a logo into a live video broadcast, an increase in processor speed would permit insertions in as few as three fields.
  • the current video signal is converted from analog to digital form by converter 82 and fed to a splitter 84 which splits the signal into its y 86 and uv 88 components.
  • the separate y and uv components of the field image are fed into a series of video delays 92 designed to keep the broadcast synchronized while the image processing takes place.
  • the y component 86 is fed into the filtering and decimation process 90 which corrects and decimates the field image as described above.
  • u and v images may also be filtered and decimated and further processed as described below.
  • the filtered and decimated images are fed into landmark tracking board (LTB) 94 which performs search, verify and track functions.
  • An information signal 95 containing illumination, magnification, horizontal translation, and vertical translation data of the current field image with respect to the reference image is generated.
  • the information signal 95 from LTB 94 is fed to a series of delays 96.
  • the LTB data signal 95 from field 2 is simultaneously fed to warper 98.
  • Warper 98 warps a pictorial representation of a portion of the reference array to the current field image to adjust for magnification and horizontal and vertical translation of the current field image with respect to the reference array. The portion which is warped depends on the shape and location of the intended insertion.
  • the filtered y, u and v components of the warped reference portion are compared to the filtered y, u and v components of the current video image by a comparator 104.
  • the result is a signal 105 containing values reflecting the changes of the y, u, and v components between the current field image and the warped reference portion. If required, these changes can be further processed to average or cluster them over time or in space to smooth the changes and enhance the reliability of the occlusion processor.
  • a square root calculation 106 is performed on the difference signal 105 on a pixel-by-pixel basis within the current field image.
  • the result is compared to a threshold value in order to locate any areas that may be occluded in the current image. If the resultant value is within a defined tolerance of the threshold, then no occluding object is deemed present. If, however, the resultant value exceeds the threshold, then an occluding object is deemed present within the current field image.
  • the result of the threshold comparison is filtered to create an occlusion mask 108.
  • This mask generates an occlusion mask key that will decide whether to broadcast the insert value or current field value of a given pixel.
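The per-pixel occlusion decision described above can be sketched as a colour-distance test between the current field and the warped reference. The tuple layout and threshold handling are illustrative assumptions; the actual system also filters the raw comparison before forming the mask.

```python
import math

def occlusion_mask(cur, ref, threshold):
    """Per-pixel occlusion decision for the intended insertion area.

    cur, ref: 2D lists of (y, u, v) tuples for the current field and
    the warped reference portion. A pixel whose colour distance from
    the reference exceeds the threshold is flagged occluded (1),
    meaning live action is covering the insert there. Sketch only.
    """
    mask = []
    for row_c, row_r in zip(cur, ref):
        mask_row = []
        for (y1, u1, v1), (y2, u2, v2) in zip(row_c, row_r):
            # Square-root-of-squared-differences, per the text.
            d = math.sqrt((y1 - y2) ** 2 + (u1 - u2) ** 2 + (v1 - v2) ** 2)
            mask_row.append(1 if d > threshold else 0)
        mask.append(mask_row)
    return mask
```

Pixels matching the empty reference closely are safe to overwrite with the insert; pixels that deviate strongly are assumed to contain a player, ball, or other foreground object.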
  • Warper 97 receives the delayed LTB data signal 95 and uses it to warp the logo to be inserted, adjusting for magnification and horizontal and vertical translation of the current field image. In the case that the grid has been distorted, it may be necessary to include distortion in warping the logo for insertion.
  • the warped logo 99, the occlusion mask key, and the delayed y 86 and uv 88 current field image components are inputs to a combiner 110.
  • the combiner 110 will pass either the insert image 99 or the current field image components y 86 and uv 88 to broadcast, depending on the mask key.
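The combiner's selection logic is a simple keyed switch per pixel. The sketch below assumes single-value pixels for brevity; the real combiner handles the y and uv components separately.

```python
def combine(cur, insert, mask):
    """Keyed combiner sketch.

    Where the occlusion mask flags live action (1), keep the
    current-field pixel so the foreground stays visible; elsewhere
    broadcast the warped insert pixel. All arguments are 2D lists of
    equal shape.
    """
    return [[c if m else i
             for c, i, m in zip(cr, ir, mr)]
            for cr, ir, mr in zip(cur, insert, mask)]
```

This is what makes an inserted logo appear to be behind the players: occluded pixels fall back to the live image rather than the insert.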
  • the resultant y 112 and uv 114 signal components are combined back to a single digital video signal.
  • the digital signal is then converted back to analog format by converter 118 prior to being broadcast live.
  • the audio signal is delayed by 8 fields to ensure that the video and audio broadcast signals are in sync with each other when broadcast.
  • a search is performed for a particular coarse feature, for example, a light to dark transition or a horizontal or vertical feature.
  • the preferred mode for conducting the search is via the Burt Pyramid algorithm.
  • the Burt Pyramid algorithm, which utilizes the decimated levels 152, 154, 156, 158 of the current image 14, allows for fast searching of the lower resolution levels for rapid identification of landmarks 42, 44, 46, since the lower resolution levels have fewer pixels to search in order to identify a particular feature than the higher resolution levels. If a search feature or landmark is found, an additional search for the same or another feature is performed to verify the location of the coarse feature, by searching for a similar feature at a higher level in the area of the image identified in the level 3 search.
  • the level 3 search can be performed using an 8 x 8 template to create, for example, a 15 x 15 correlation surface.
  • Each 8 x 8 template is tailored for a particular feature, such as a vertical line, a horizontal line, a diagonal line or a corner.
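The template search can be sketched as sliding a small feature template over a window around the landmark's expected position and recording a correlation score at each offset. With an 8 x 8 template and a 7-pixel search radius this yields the roughly 15 x 15 correlation surface the text mentions. The plain dot-product score below is an illustrative assumption, not necessarily the patent's exact correlation metric.

```python
def correlation_surface(image, template, cx, cy):
    """Correlation surface of a template around expected position (cx, cy).

    image: 2D list of pixel values; template: small 2D list (e.g. 8x8
    tailored to a vertical, horizontal, diagonal or corner feature).
    Returns a 15x15 surface of dot-product scores for offsets in
    [-7, +7] in each axis. Caller must keep the window in bounds.
    """
    th, tw = len(template), len(template[0])
    radius = 7
    surface = []
    for dy in range(-radius, radius + 1):
        row = []
        for dx in range(-radius, radius + 1):
            s = 0.0
            for ty in range(th):
                for tx in range(tw):
                    s += image[cy + dy + ty][cx + dx + tx] * template[ty][tx]
            row.append(s)
        surface.append(row)
    return surface
```

The peak of the surface gives the landmark's offset from its predicted position, which feeds the translation and zoom estimates discussed next.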
  • the search function identifies landmarks 42, 44, 46 and returns an estimate of the translation in the X and Y directions (Tx, Ty) and the zoom (k), which in turn is used to determine the current position and orientation of the landmarks 42, 44, 46 in the current image 14 compared to the location of the landmarks 42, 44, 46 in the reference image 48. If the search mode 124 is successful, then the verify mode 126 is entered. It is often desirable to use search templates much larger than 8 x 8 if the features are large or the search is carried out at a low level of decimation.
  • the transition from the search mode 124 to the verify mode 126 is made if the search mode 124 produces a preset number of "good" landmarks.
  • a "good" landmark has a correlation value within a fixed range of the correlation value produced in the reference image, and satisfies predetermined continuity and line checks. Correlation of a landmark 42 that is a vertical line could be checked to make sure that three consecutive correlations have values within a limited range of each other, and the surface could be checked to the left and the right of the located line to detect the absence of a line in that location.
  • Verification is conducted at level 0 or level 1 for up to three fields. If there is no successful verification, then the system returns to the search mode 124. If the verification criteria are met, then insertion of an insertable image is performed. No insertion is performed, however, unless certain criteria are met: the number of good landmarks must exceed a preset value; more than two landmarks must be vertical, thereby ensuring a good zoom calculation; and a portion of all landmarks must be "quality" landmarks.
  • a quality landmark is defined as having a distance error weighting above a predetermined value, determined as a function of the distance between the current landmark, i.e., the landmark in the current image, and the position where the previous landmark model predicted the current landmark would be.
  • a landmark model 140 is the model formed by landmarks 42, 44, 46 in each field.
  • the first landmark model is established by the landmarks 42, 44, 46 in the reference image 48 of Fig. 2.
  • the landmark model 140 is formed by determining a geometric relationship between the landmarks 42, 44, 46.
  • the landmarks 42, 44, 46 for the current field image 14 are compared to the landmark model 140 generated in the reference image 48 to determine the translation and zoom changes from the reference image 48 to the current field image 14.
  • the landmarks 42, 44, 46 are again located and the location of each current landmark is compared to its predicted location based on the landmark model 140 from the prior field.
  • the landmarks 42, 44, 46 in the current field image 14 are fitted to the prior landmark model 140 using a least squares fit. This comparison with the prior landmark model 140 generates a weight to be assigned to the location of each current landmark 42, 44, 46.
  • the weight assigned to each current landmark 42, 44, 46 is used in the calculation of a new landmark model 140 for the current landmarks 42, 44, 46.
  • the final verification criterion is that there must be no missing landmarks, or, if a landmark is missing, it must be occluded. Moreover, if the search results are sufficiently accurate, the verify step may be eliminated.
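The least-squares fit of the current landmarks to the prior landmark model, and the residuals that feed the distance-error weighting, can be sketched under a pure zoom-plus-translation motion model (no rotation), matching the (Tx, Ty, k) parameters used throughout. Illustrative Python; the function name and return convention are assumptions.

```python
import numpy as np

def fit_zoom_translation(ref_pts, cur_pts):
    """Least-squares fit of current landmark positions to the reference
    landmark model under a zoom-plus-translation motion:

        cur ~= k * ref + (tx, ty)

    Returns (k, tx, ty) and the per-landmark residual distances, which
    could serve as the distance-error weights described above.
    """
    ref = np.asarray(ref_pts, dtype=float)
    cur = np.asarray(cur_pts, dtype=float)
    # Center both point sets; the zoom is the ratio of centered spreads.
    ref_c = ref - ref.mean(axis=0)
    cur_c = cur - cur.mean(axis=0)
    k = (ref_c * cur_c).sum() / (ref_c * ref_c).sum()
    # Translation aligns the centroids after zooming.
    t = cur.mean(axis=0) - k * ref.mean(axis=0)
    residuals = np.linalg.norm(k * ref + t - cur, axis=1)
    return k, t[0], t[1], residuals
```

A landmark whose residual is large would receive a low weight (or be flagged as occluded) in the next landmark-model update.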
  • the program enters the tracking mode 128.
  • the system enters the tracking mode 128, which indicates how the camera 12 is moving.
  • the system obtains information on the movement of the camera 12 from the current field image 14 by monitoring the motion of the landmarks 42, 44, 46.
  • the tracking functions are performed at the current level 0 image 152 or the current level 1 image 154.
  • the landmarks 42, 44, 46 in each field are collectively referred to as a landmark model 140.
  • each subsequent field is substantially similar to the previous field.
  • a pixel-by-pixel search using tailored templates in the extended region of the location of each landmark 42, 44, 46, as predicted by the previous field landmark model 140, determines the incremental change in the position of the scene.
  • the decimated images 152, 154, 156, 158 in levels 0-3, for example, continue to be generated for each field. While there is a selectable limit on the number of landmarks that must be present to do tracking, there must be more than two landmarks in the zoom direction (vertical) and at least one other landmark in another (e.g. horizontal) direction. If, however, the zoom measurement is lost for no more than three frames, the system will continue to operate if there is at least one landmark.
  • the tracking function uses Gaussian rather than Laplacian decimated images, which improves the signal-to-noise ratio and preserves valuable lower-frequency information. If the tracking criteria are not met, then the system returns to the search mode 124.
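A Gaussian decimation pyramid of the kind the tracking function relies on can be built by repeated blur-and-subsample. This is a minimal sketch assuming the common separable 5-tap Gaussian kernel; the actual filter taps and decimation scheme of the system are not specified here.

```python
import numpy as np

def gaussian_decimate(image, levels):
    """Build a Gaussian pyramid: each level blurs the previous one with a
    separable 5-tap kernel and subsamples by two.  Tracking on Gaussian
    (rather than Laplacian) levels keeps the low-frequency content the
    text refers to.  Illustrative sketch only.
    """
    kernel = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels):
        img = pyramid[-1]
        # Separable blur: rows then columns, edges handled by reflection.
        blurred = np.apply_along_axis(
            lambda r: np.convolve(np.pad(r, 2, mode="reflect"), kernel, "valid"),
            1, img)
        blurred = np.apply_along_axis(
            lambda c: np.convolve(np.pad(c, 2, mode="reflect"), kernel, "valid"),
            0, blurred)
        pyramid.append(blurred[::2, ::2])  # decimate by 2 in each direction
    return pyramid
```

Level 0 is the full-resolution field; levels 1-3 correspond to the decimated images 154, 156, 158 used for coarse search and tracking.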
  • the system performs an occlusion operation on the pattern to be inserted into the insertion region.
  • Occlusion accounts for obstacles in the actual current field image which may occlude, to some extent, the insertion region.
  • In order to perform a real-time insertion that is realistically merged into the current image 14, obstacles in the insertion region must be identified and the insertion adjusted, and in some cases withheld, so as not to insert an insertable image over an obstacle.
  • a filtered pictorial representation of a portion of the reference array is generated.
  • the reference image is filtered to reduce the high- frequency components.
  • the lower-resolution representation of the reference array is regularly updated for brightness via illumination-sensitive sensors which are positioned in close proximity to the insert locations within the stadium. This is done to match the brightness of the pictorial representation to the current field image 14.
  • the results are stored in the computer system.
  • Each current field image 14 is also filtered to reduce the high-frequency components, yielding a lower-resolution representation of the current field image 14.
  • the filtering of the reference array 48 is often greater than the filtering of the current image 14.
  • the objects which occlude the inserted image are real physical objects; in general they will be larger than one pixel and appear in more than one frame. Therefore, the accuracy and noise of the occlusion processing can be additionally improved by clustering the occluding pixels into groups and by tracking their motion over time from field to field. Thus, better judgments can be made about whether a particular pixel is part of an occluding mass or not.
  • a portion of the lower-resolution representation of the reference array 48 is adjusted for translation (location) and zoom (size), as well as for illumination, as indicated previously.
  • the modified lower-resolution representation of the reference array 48 is then compared, on a pixel-by-pixel basis, with the lower-resolution representation of the current image 14 to identify any obstacles in the current image 14, the pixels in the reference and current images now having a 1:1 ratio.
  • a transparency function or mask key is determined which can then be applied to the insertable image during insertion to properly account for any obstacles that may be present in the insertion region, and thus may affect the insertion of certain pixels into the current image 14.
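The pixel-by-pixel comparison that yields the transparency mask can be sketched as a thresholded difference between the adjusted low-resolution reference and the low-resolution current image. Illustrative only; the threshold value is an assumption, and the clustering and temporal-tracking refinements from the preceding bullets are omitted.

```python
import numpy as np

def occlusion_mask(reference, current, threshold):
    """Compare the (warped, illumination-adjusted) low-resolution reference
    with the low-resolution current image, pixel by pixel.  Pixels whose
    difference exceeds the threshold are treated as occluded by an
    obstacle, and insertion is suppressed there.
    """
    diff = np.abs(np.asarray(current, dtype=float)
                  - np.asarray(reference, dtype=float))
    return diff <= threshold  # True where the insert may be drawn
```

The boolean key then gates insertion, e.g. `np.where(mask, insert_pixels, current_pixels)`, so that obstacles remain in front of the inserted image.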
  • the insertion operation is performed as follows.
  • the insertion region has a mathematical relation to a pre-selected reference location 49.
  • the reference location 49 has a mathematical relationship with the landmark model 140 identified in the reference image 48. That is, the reference location 49 of the insertion region has a relative position with respect to the landmark model 140.
  • the corresponding change in the translation and zoom of the insertion region can be determined as a function of the reference location 49.
  • the X and Y translation of the reference location 49 is calculated, the zoom function is applied to the stored insertable image, and the insertable image is inserted into the insertion region of the current image 14 on a pixel-by-pixel basis, using the reference location 49 of the insertion region as a reference point for positioning the insertable image.
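The insertion step, zooming the stored insertable image by k, translating the reference location, and copying unoccluded pixels into the frame, can be sketched as follows. Nearest-neighbour zoom is used for brevity, the zoomed insert is assumed to lie fully inside the frame, and all names are illustrative.

```python
import numpy as np

def insert_image(frame, insert, mask, ref_loc, tx, ty, k):
    """Zoom the stored insertable image by k, then copy it pixel by pixel
    into the frame at the translated reference location, skipping pixels
    the occlusion mask marks as blocked (mask is frame-sized, True where
    insertion is allowed).
    """
    ih, iw = insert.shape
    zh, zw = max(1, round(ih * k)), max(1, round(iw * k))
    # Nearest-neighbour resampling of the insert to the zoomed size.
    ys = (np.arange(zh) / k).astype(int).clip(0, ih - 1)
    xs = (np.arange(zw) / k).astype(int).clip(0, iw - 1)
    zoomed = insert[np.ix_(ys, xs)]
    # Position the insert at the translated reference location.
    x0, y0 = int(ref_loc[0] + tx), int(ref_loc[1] + ty)
    out = frame.copy()
    region = out[y0:y0 + zh, x0:x0 + zw]
    allowed = mask[y0:y0 + zh, x0:x0 + zw]
    region[allowed] = zoomed[allowed]
    return out
```

With an all-True mask this is a plain zoom-and-paste; with the occlusion key from the previous step, obstacle pixels are left untouched and the insert appears behind the action.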
  • Fig. 13 shows a current field image 180 as seen by the television viewer. Insert 182 appears on the back wall behind home plate. This example shows a static insertion 182 that is not being occluded by the current action of the game. This time, however, the magnification factor is k > 1, i.e. the scene is magnified.
  • Fig. 14 shows a current field image 184 as seen by the television viewer. Insert 186 appears partially obstructed on the back wall behind home plate. This example shows a static insertion 186 that is being occluded by the current action of the game. The system keeps the logo in the background of the scene. Again, the magnification factor of this example is k > 1.
  • Fig. 15 shows a current field image 188 as seen by the television viewer.
  • the inserted image 190 of a walking rabbit appears to be moving horizontally across the screen in each subsequent field. Additionally, the rabbit itself is changing shape in that its arms and legs are moving in each new field.
  • An insertable image is not limited to one region or one shape. The location and shape of the insert 190 may be altered from field to field by an operator on a real-time basis, or altered automatically by a preprogrammed sequence or video.
  • Fig. 16 is a schematic representation of the hardware implementation of the system and method of the preferred embodiment of the invention.
  • the present invention is capable of seamlessly placing an insertable image directly into a live video broadcast without having to identify any particular existing advertisement or "target zone" in the current scene. Therefore, the insertable image appears natural and seamless within the broadcast and does not pop up noticeably in the current field of view.
  • the system can easily insert a moving insertable image within the live video broadcast. Further, the system can move the insertable image two different ways within the current scene. First, the insert as a whole can change its position within the current scene. Second, the insertable image itself can change its own shape from field to field. Thus, the present invention can readily support insertion of dynamic images within a live video broadcast.
  • the system automatically adjusts the zoom factor of the insertable image without external sensory input. Zoom adjustments are calculated based on the spatial relationship of objects within the current scene and not on sensed physical adjustments of the camera itself or non-repeatable sensors on the camera. Therefore, the present invention is not susceptible to performance degradations due to unwanted camera motion.
  • the system is operated on a real-time basis in that insertable images and their points of insertion need not be run by a "canned" process.
  • the system operator can choose virtually any point of insertion within the current scene during the actual broadcast. For example, if a particular section of a stadium is relatively empty of fans the operator could insert an image over the empty seats. Thus, the system operator can use space that was not known to be available prior to the live broadcast.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Studio Circuits (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Studio Devices (AREA)
EP96921559A 1995-06-16 1996-06-12 Verfahren und vorrichtung zum einfügen von dynamischen und statischen bildern in einer echtzeitvideoübertragung Withdrawn EP0832537A4 (de)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US27995P 1995-06-16 1995-06-16
US27 1995-06-16
US563598 1995-11-28
US08/563,598 US5892554A (en) 1995-11-28 1995-11-28 System and method for inserting static and dynamic images into a live video broadcast
PCT/US1996/010163 WO1997000581A1 (en) 1995-06-16 1996-06-12 System and method for inserting static and dynamic images into a live video broadcast

Publications (2)

Publication Number Publication Date
EP0832537A1 true EP0832537A1 (de) 1998-04-01
EP0832537A4 EP0832537A4 (de) 2000-10-18

Family

ID=26667432

Family Applications (1)

Application Number Title Priority Date Filing Date
EP96921559A Withdrawn EP0832537A4 (de) 1995-06-16 1996-06-12 Verfahren und vorrichtung zum einfügen von dynamischen und statischen bildern in einer echtzeitvideoübertragung

Country Status (6)

Country Link
EP (1) EP0832537A4 (de)
JP (1) JPH11507796A (de)
AU (1) AU6276096A (de)
BR (1) BR9609169A (de)
PE (1) PE18698A1 (de)
WO (1) WO1997000581A1 (de)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3724117B2 (ja) * 1997-05-23 2005-12-07 ソニー株式会社 画像生成装置および画像生成方法
FR2770072B1 (fr) * 1997-08-19 2000-07-28 Serge Dulin Camera virtuelle
GB2344714A (en) * 1998-09-22 2000-06-14 Orad Hi Tec Systems Ltd Method and apparatus for creating real digital video effects
EP1047264B1 (de) 1999-04-22 2007-05-09 Leo Vision Vorrichtung und Verfahren zur Bildverarbeitung und -wiederherstellung mit erneuter Abtastung
ES2158797B1 (es) * 1999-08-12 2002-04-01 Nieto Ramon Rivas Dispositivo generador multiuso y/o multidestino de los contenidos en modulos o paneles publicitarios, informativos u ornamentaltes y similatres, que quedan integrados en las imagenes retransmitidas y/o filmadas.
US6993245B1 (en) 1999-11-18 2006-01-31 Vulcan Patents Llc Iterative, maximally probable, batch-mode commercial detection for audiovisual content
WO2001063916A1 (en) 2000-02-25 2001-08-30 Interval Research Corporation Method and system for selecting advertisements
US6968565B1 (en) 2000-02-25 2005-11-22 Vulcan Patents Llc Detection of content display observers with prevention of unauthorized access to identification signal
US8910199B2 (en) 2000-02-25 2014-12-09 Interval Licensing Llc Targeted television content display
JP5529773B2 (ja) * 2000-04-05 2014-06-25 雅信 鯨田 番組送信等システム、番組再生等システム、及び番組へのcm提供システム
IL136080A0 (en) * 2000-05-11 2001-05-20 Yeda Res & Dev Sequence-to-sequence alignment
JP2006333420A (ja) * 2005-05-27 2006-12-07 Shinzo Ito スポーツの映像を検出した検出映像とcm映像を組合せテレビ放送用のテレビ放送映像信号を生成する方法およびテレビ放送映像信号を含む信号を送信する装置
JP5162928B2 (ja) * 2007-03-12 2013-03-13 ソニー株式会社 画像処理装置、画像処理方法、画像処理システム
ITMI20100262A1 (it) * 2010-02-19 2011-08-20 Jarno Zaffelli Metodo per la visualizzazione sostanzialmente statica di immagini in riprese dinamiche e sistema implementante lo stesso
FR2959339A1 (fr) * 2010-04-26 2011-10-28 Citiled Procede de commande d'au moins un panneau d'affichage d'images variables dans un lieu tel qu'un stade
JP5465620B2 (ja) 2010-06-25 2014-04-09 Kddi株式会社 映像コンテンツに重畳する付加情報の領域を決定する映像出力装置、プログラム及び方法
RU2612378C1 (ru) 2013-03-08 2017-03-09 ДиджитАрена СА Способ замены объектов в потоке видео
EP3094082A1 (de) * 2015-05-13 2016-11-16 AIM Sport Vision AG Digitale überlagerung eines bildes mit einem anderen bild

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1995010919A1 (en) * 1993-02-14 1995-04-20 Orad, Inc. Apparatus and method for detecting, identifying and incorporating advertisements in a video

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2661061B1 (fr) * 1990-04-11 1992-08-07 Multi Media Tech Procede et dispositif de modification de zone d'images.
ES2136603T3 (es) * 1991-07-19 1999-12-01 Princeton Video Image Inc Presentaciones televisivas con signos insertados, seleccionados.
GB9119964D0 (en) * 1991-09-18 1991-10-30 Sarnoff David Res Center Pattern-key video insertion

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1995010919A1 (en) * 1993-02-14 1995-04-20 Orad, Inc. Apparatus and method for detecting, identifying and incorporating advertisements in a video

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of WO9700581A1 *

Also Published As

Publication number Publication date
BR9609169A (pt) 1999-12-14
EP0832537A4 (de) 2000-10-18
MX9710191A (es) 1998-10-31
PE18698A1 (es) 1998-05-04
AU6276096A (en) 1997-01-15
JPH11507796A (ja) 1999-07-06
WO1997000581A1 (en) 1997-01-03

Similar Documents

Publication Publication Date Title
US5892554A (en) System and method for inserting static and dynamic images into a live video broadcast
JP3242406B2 (ja) パターン・キー挿入を用いたビデオ合併
JP3496680B2 (ja) ビデオイメージ中の複数の追跡基準領域から推定された目標領域の場所の安定的推定法
US6384871B1 (en) Method and apparatus for automatic electronic replacement of billboards in a video image
US5808695A (en) Method of tracking scene motion for live video insertion systems
WO1997000581A1 (en) System and method for inserting static and dynamic images into a live video broadcast
US5920657A (en) Method of creating a high resolution still image using a plurality of images and apparatus for practice of the method
JP2001506820A (ja) 画像テクスチャーテンプレートを用いた動き追跡
US20140369661A1 (en) System for filming a video movie
GB2305051A (en) Automatic electronic replacement of billboards in a video image
KR20030002919A (ko) 방송 영상에서의 실시간 이미지 삽입 시스템
CA2393803C (en) Method and apparatus for real time insertion of images into video
KR100466587B1 (ko) 합성영상 컨텐츠 저작도구를 위한 카메라 정보추출 방법
MXPA97010191A (en) System and method for inserting static and dynamic images in a v devide transmission
AU702724B1 (en) Image manipulation apparatus
Burt et al. Video mosaic displays
NZ624929B2 (en) System for filming a video movie

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19980105

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

A4 Supplementary search report drawn up and despatched

Effective date: 20000905

AK Designated contracting states

Kind code of ref document: A4

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

RIC1 Information provided on ipc code assigned before grant

Free format text: 7H 04N 9/74 A, 7H 04N 5/272 B

17Q First examination report despatched

Effective date: 20030204

111Z Information provided on other rights and legal means of execution

Free format text: AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

Effective date: 20030725

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20040930