EP1433333A1 - Method and device for coding a scene - Google Patents

Method and device for coding a scene

Info

Publication number
EP1433333A1
Authority
EP
European Patent Office
Prior art keywords
image
images
scene
composition
textures
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP02791510A
Other languages
English (en)
French (fr)
Inventor
Paul Kerbiriou
Gwenaël Kervella
Laurent Blonde
Michel Kerdranvat
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
THOMSON LICENSING
Original Assignee
Thomson Licensing SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomson Licensing SAS filed Critical Thomson Licensing SAS
Publication of EP1433333A1
Legal status: Withdrawn

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • The invention relates to a method and a device for coding and decoding a scene composed of objects whose textures come from different video sources.
  • Multimedia broadcasting systems are generally based on the transmission of video information, either via separate elementary streams, via a transport stream multiplexing the different elementary streams, or via a combination of the two.
  • This video information is received by a terminal or receiver made up of a set of elementary decoders that simultaneously decode each of the elementary streams received or demultiplexed.
  • The final image is composed from the decoded information. This is, for example, the case for the transmission of streams of MPEG-4 coded video data.
  • This type of advanced multimedia system attempts to offer great flexibility to the end user, offering possibilities for composing several streams and for interactivity at the terminal level.
  • The extra processing is actually quite significant if one considers the complete chain, from the generation of elementary streams to the rendering of a final image. It concerns every level of the chain: coding, addition of inter-stream synchronization elements and packetization, multiplexing, demultiplexing, taking into account of inter-stream synchronization elements and depacketization, and decoding.
  • A composition system upon reception produces the final image of the scene to be viewed according to the information defined by the content creator.
  • Considerable management complexity is therefore generated, at the system level as well as at the processing level.
  • These systems therefore require the management of numerous data streams, both at transmission and at reception. It is not possible to achieve, in a simple way, a local composition or "scene" from several videos. Expensive devices such as decoders, and complex management of these decoders, must be put in place for the exploitation of these streams.
  • The number of decoders can be a function of the different types of coding used for the data received on each of the streams, but also of the number of video objects that can compose the scene.
  • The processing time of the received signals, due to the centralized management of the decoders, is not optimized. The management and processing of the images obtained are complex because of their multitude.
  • The invention aims to overcome the aforementioned drawbacks.
  • Its subject is a method of coding a scene made up of objects whose textures are defined from images or parts of images from different video sources, characterized in that it comprises the steps of dimensioning and positioning these images or parts of images to produce a composed image, coding the composed image, and coding auxiliary data comprising information relating to the composition of the composed image and information relating to the textures of the objects.
  • The composite image is obtained by spatial multiplexing of the images or parts of images; this composition step is sketched in code after these bullets.
  • The video sources from which the images or parts of images composing the same composed image are selected use the same coding standard.
  • The composite image may also include a still image that does not come from a video source.
  • The dimensioning is a reduction in size obtained by subsampling.
  • The composed image is coded according to the MPEG-4 standard, and the information relating to the composition of the image is the texture coordinates.
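As a hedged illustration of the coding steps above — subsampling, spatial multiplexing into a composite image, and generation of the auxiliary composition data — here is a minimal sketch. The array layout, field names and `compose_mosaic` function are assumptions for illustration, not the patent's implementation.

```python
import numpy as np

def subsample(image: np.ndarray, factor: int) -> np.ndarray:
    """Dimensioning by subsampling: keep every `factor`-th pixel."""
    return image[::factor, ::factor]

def compose_mosaic(sources, mosaic_h, mosaic_w):
    """Spatially multiplex (name, image, x, y, factor) tuples into one mosaic.

    Returns the composite image plus auxiliary composition data giving, for
    each element, its position and size inside the mosaic.
    """
    mosaic = np.zeros((mosaic_h, mosaic_w, 3), dtype=np.uint8)
    aux_data = []
    for name, image, x, y, factor in sources:
        sub = subsample(image, factor)
        h, w = sub.shape[:2]
        mosaic[y:y + h, x:x + w] = sub          # place the element on the grid
        aux_data.append({"element": name,       # identifies the texture
                         "x": x, "y": y,        # top-left corner in the mosaic
                         "w": w, "h": h,        # size inside the mosaic
                         "factor": factor})     # subsampling that was applied
    return mosaic, aux_data

# Example: two sources reduced to QCIF (176x144) and packed side by side
# into one CIF-sized (352x288) composite image.
src_a = np.random.randint(0, 256, (288, 352, 3), dtype=np.uint8)
src_b = np.random.randint(0, 256, (288, 352, 3), dtype=np.uint8)
mosaic, aux = compose_mosaic([("video_a", src_a, 0, 0, 2),
                              ("video_b", src_b, 176, 0, 2)], 288, 352)
```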
  • The invention also relates to a method for decoding a scene composed of objects, coded from a composite video image grouping images or parts of images from different video sources and from auxiliary data comprising composition information for the composite video image and information relating to the textures of the objects, characterized in that it performs the steps of decoding the composite image and the auxiliary data, extracting the textures from the decoded image on the basis of the composition information, and applying the textures to the objects of the scene on the basis of the information relating to the textures.
  • The method is characterized in that the extraction of the textures is carried out by spatial demultiplexing of the decoded image.
  • The method is characterized in that a texture is processed by oversampling and spatial interpolation to obtain the texture to be displayed in the final image presenting the scene.
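The receiver-side counterpart can be sketched in the same spirit: spatial demultiplexing crops each texture out of the decoded mosaic, and oversampling with spatial interpolation (bilinear here, one plausible choice among others) restores the display size. The `entry` field names match the hypothetical encoder sketch above; H×W×3 arrays are assumed.

```python
import numpy as np

def extract_texture(mosaic: np.ndarray, entry: dict) -> np.ndarray:
    """Spatial demultiplexing: crop one element out of the decoded mosaic."""
    x, y, w, h = entry["x"], entry["y"], entry["w"], entry["h"]
    return mosaic[y:y + h, x:x + w]

def upsample_bilinear(texture: np.ndarray, factor: int) -> np.ndarray:
    """Oversampling with bilinear spatial interpolation, per channel."""
    h, w = texture.shape[:2]
    ys = np.linspace(0, h - 1, h * factor)      # fractional source rows
    xs = np.linspace(0, w - 1, w * factor)      # fractional source columns
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]               # vertical blend weights
    wx = (xs - x0)[None, :, None]               # horizontal blend weights
    t = texture.astype(float)
    top = t[y0][:, x0] * (1 - wx) + t[y0][:, x1] * wx
    bot = t[y1][:, x0] * (1 - wx) + t[y1][:, x1] * wx
    return ((1 - wy) * top + wy * bot).astype(np.uint8)

# textures = {e["element"]: extract_texture(decoded_mosaic, e) for e in aux}
# display = upsample_bilinear(textures["video_a"], factor=2)
```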
  • The invention also relates to a device for coding a scene composed of objects whose textures are defined from images or parts of images from different video sources, characterized in that it comprises:
  • a video editing circuit receiving the different video sources, for dimensioning and positioning images or parts of images originating from these sources on an image, in order to produce a composite image,
  • an auxiliary data generation circuit connected to the video editing circuit, to supply information relating to the composition of the composed image and information relating to the textures of the objects, and a coding circuit for the composed image.
  • The invention also relates to a device for decoding a scene composed of objects, coded from a composite video image grouping together images or parts of images from different video sources and from auxiliary data comprising composition information for the composite video image and information relating to the textures of the objects, characterized in that it comprises:
  • a circuit for decoding the composite image, a circuit for decoding the auxiliary data, and a processing circuit receiving the auxiliary data and the decoded image, for extracting textures from the decoded image on the basis of the auxiliary data relating to the composition of the image, and for applying the textures to the objects of the scene on the basis of the auxiliary data relating to the textures.
  • The idea of the invention is to group, on one image, elements or texture elements, which are images or parts of images coming from different video sources and necessary for the construction of the scene to be visualized, so as to "transport" this video information on a single image or a limited number of images.
  • A spatial composition of these elements is therefore produced, and it is the overall composite image obtained which is coded, instead of coding each video image from the video sources separately.
  • An overall scene whose construction usually requires several video streams can thus be constructed from a more limited number of video streams, and even from a single video stream transmitting the composed image.
  • The decoding circuits are simplified, and the construction of the scene is carried out in a more flexible manner.
  • QCIF format (Quarter Common Intermediate Format)
  • CIF format (Common Intermediate Format)
  • On reception, the image is not simply presented. It is recomposed using the transmitted composition information. This makes it possible to present the user with a less frozen image, potentially including an animation resulting from the composition, and to offer further interactivity, each recomposed object being able to be active.
  • Management at the receiver is simplified, the data to be transmitted can be compressed further owing to the grouping of video data on one image, and the number of circuits necessary for decoding is reduced. Optimizing the number of streams minimizes the resources required relative to the content transmitted.
  • FIG. 1 a coding device according to the invention
  • FIG. 1 represents a coding device according to the invention.
  • The circuits 1 to n symbolize the generation of the various video signals available to the encoder for the coding of a scene to be viewed at the receiver. These signals are transmitted to a composition circuit 2, whose function is to compose an overall image from the images corresponding to the signals received. The overall image obtained is called the composite image or mosaic.
  • This composition is defined on the basis of information exchanged with an auxiliary data generation circuit 4.
  • The composition information makes it possible to define the composed image and thus to extract, at the receiver, the various elements or sub-images composing this image; it consists, for example, of position and shape information in the image, such as the coordinates of the vertices of rectangles if the elements constituting the transmitted image are rectangular, or shape descriptors otherwise.
  • This composition information makes it possible to extract textures, and it is thus possible to define a library of textures for the composition of the final scene.
  • The auxiliary data relate to the image composed by the circuit 2, but also to the final image representing the scene to be viewed at the receiver.
  • They include graphic information, for example relating to geometric shapes, appearances and the composition of the scene, making it possible to configure the scene represented by the final image.
  • This information defines the elements to be associated with graphic objects for the mapping of textures. It also defines the possible interactivities that make it possible to reconfigure the final image.
  • The composition of the image to be transmitted can be optimized according to the textures necessary for the construction of the final scene.
  • The composite image generated by the composition circuit 2 is transmitted to a coding circuit 3, which codes this image.
  • The auxiliary data from circuit 4 are transmitted to a coding circuit 5, which codes these data.
  • The outputs of the coding circuits 3 and 5 are transmitted to the inputs of a multiplexing circuit 6, which multiplexes the received data, i.e. the video data relating to the composed image and the auxiliary data.
  • The output of the multiplexing circuit is transmitted to the input of a transmission circuit 7 for the transmission of the multiplexed data.
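As a hedged sketch of circuits 5, 6 and 7 (and of the receiver-side demultiplexer described further on), the auxiliary data can be serialized and interleaved with the coded video as type-tagged, length-prefixed packets; this framing is an illustrative assumption, not a normative transport-stream format.

```python
import json
import struct

VIDEO, AUX = 0, 1  # one-byte packet type tags (assumed convention)

def code_aux(aux_data) -> bytes:
    """Stand-in for coding circuit 5: serialize the auxiliary data."""
    return json.dumps(aux_data).encode("utf-8")

def multiplex(coded_video: bytes, coded_aux: bytes) -> bytes:
    """Stand-in for multiplexing circuit 6: interleave the two streams."""
    out = bytearray()
    for tag, payload in ((VIDEO, coded_video), (AUX, coded_aux)):
        out += struct.pack(">BI", tag, len(payload))  # type + length header
        out += payload
    return bytes(out)

def demultiplex(stream: bytes) -> dict:
    """Receiver-side counterpart (demultiplexer 9): split the stream."""
    packets, offset = {}, 0
    while offset < len(stream):
        tag, length = struct.unpack_from(">BI", stream, offset)
        offset += struct.calcsize(">BI")
        packets[tag] = stream[offset:offset + length]
        offset += length
    return packets
```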
  • The composite image is produced from images or parts of images of any shape extracted from video sources, but it may also contain still images or, in general, any type of representation. Depending on the number of sub-images to be transmitted, one or more composed images can be produced for the same instant, that is to say for one final image of the scene. In the case where the video signals use different standards, these signals can be grouped by standard of the same type for the composition of a composite image.
  • For example, a first composition is made from all the elements to be coded according to the MPEG-2 standard, a second composition from all the elements to be coded according to the MPEG-4 standard, and another from the elements to be coded as JPEG or GIF images or otherwise, so that a single stream is emitted per type of coding and/or per media type.
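A minimal sketch of this grouping, assuming each element simply carries a `standard` attribute (an illustrative name, not a field defined by the patent):

```python
from collections import defaultdict

def group_by_standard(elements):
    """Return {standard: [elements]}: one group, and thus one composite
    image and one stream, per type of coding."""
    groups = defaultdict(list)
    for element in elements:
        groups[element["standard"]].append(element)
    return dict(groups)

elements = [{"name": "news", "standard": "MPEG-2"},
            {"name": "clip", "standard": "MPEG-4"},
            {"name": "logo", "standard": "JPEG"},
            {"name": "banner", "standard": "JPEG"}]
for standard, group in group_by_standard(elements).items():
    print(standard, [e["name"] for e in group])   # one mosaic per standard
```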
  • The composed image may be a regular mosaic, consisting for example of rectangles or sub-images of the same size, or else an irregular mosaic.
  • The auxiliary stream transmits the data corresponding to the composition of the mosaic.
  • The composition circuit can perform the composition of the overall image from enclosing rectangles or limitation windows defining the elements.
  • A choice of the elements necessary for the final scene is made by the composer.
  • These elements are extracted from images available to the composer from different video streams.
  • A spatial composition is then produced from the selected elements by "placing" them on a global image constituting a single video.
  • The information about the positioning of these various elements (coordinates, dimensions, etc.) is transmitted to the auxiliary data generation circuit, which processes it for transmission on the stream.
  • The composition circuit is in the known field; it is, for example, a professional video editing tool of the "Adobe Premiere" type (Adobe Premiere is a registered trademark). Thanks to such a circuit, objects can be extracted from video sources, for example by selecting parts of images; the images of these objects can be resized and positioned on a global image. A spatial multiplexing is, for example, carried out to obtain the composite image.
  • The means of constructing a scene, from which part of the auxiliary data is generated, are also in the known field.
  • The MPEG-4 standard uses the VRML language (Virtual Reality Modeling Language), or more precisely the binary language BIFS (BInary Format for Scenes), which makes it possible to define the presentation of a scene, to modify it and to update it.
  • The BIFS description of a scene makes it possible to modify the properties of objects and to define their conditional behavior. It follows a hierarchical structure which is a tree description.
  • The data necessary for the description of a scene concern, among other things, the construction rules, the animation rules for one object, the interactivity rules for another object, and so on. They describe the final scenario. Some or all of this data constitutes the auxiliary data for the construction of the scene.
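Purely as an illustration of such a hierarchical description — nested Python dictionaries standing in for scene-tree nodes, not actual BIFS or VRML syntax, with every node and field name assumed — a scene tree could associate graphic objects with regions of the composite texture and attach an interactivity rule:

```python
scene = {
    "type": "Group",                              # root of the scene tree
    "children": [
        {"type": "Shape",
         "geometry": {"type": "Rectangle", "size": (176, 144)},
         "appearance": {
             "texture": "composite_image",        # texture from the mosaic
             # normalized (u, v) pairs: which part of the mosaic to map
             "texCoord": [(0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (0.0, 1.0)]},
         "behavior": {"onClick": "switch_to_program_a"}},  # interactivity rule
        {"type": "Shape",
         "geometry": {"type": "Rectangle", "size": (176, 144)},
         "appearance": {
             "texture": "composite_image",
             "texCoord": [(0.5, 0.5), (1.0, 0.5), (1.0, 1.0), (0.5, 1.0)]}},
    ],
}
```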
  • FIG. 2 represents a receiver for such a coded data stream.
  • The signal received at the input of the receiver 8 is transmitted to a demultiplexer 9, which separates the video stream from the auxiliary data.
  • The video stream is transmitted to a video decoding circuit 10, which decodes the overall image as it was composed at the coder.
  • The auxiliary data at the output of the demultiplexer 9 are transmitted to a decoding circuit 11, which decodes the auxiliary data.
  • A processing circuit 12 processes the video data and the auxiliary data coming respectively from the circuits 10 and 11, to extract the elements, i.e. the textures, necessary for the construction of the scene. Using the recomposition information, only these elements are extracted from the composed image; the extraction is carried out, for example, by spatial demultiplexing.
  • The construction information therefore makes it possible to select only a part of the elements constituting the composed image. It also allows the user to "navigate" in the constructed scene in order to view objects of interest.
  • The navigation information from the user is, for example, transmitted to an input of the circuit 12 (not shown in the figure), which modifies the composition of the scene accordingly.
  • The textures transported by the composed image may not be used directly in the scene. They can, for example, be stored by the receiver for deferred use or for the constitution of a library used for the construction of the scene.
  • An application of the invention relates to the transmission of video data in the MPEG-4 standard corresponding to several programs over a single video stream, or more generally to the optimization of the number of streams in an MPEG-4 configuration, for example for a program guide application. Whereas in a classic MPEG-4 configuration it is necessary to transmit as many streams as there are videos that can be viewed at the terminal, the method described makes it possible to send a global image containing several videos and to use texture coordinates to build a new scene upon arrival.
  • FIG. 3 represents an example of a composite scene constructed from elements of a composite image.
  • The global image 14, also called the composite texture, is composed of several sub-images or elements or sub-textures 15, 16, 17, 18, 19.
  • The image 20, at the bottom of the figure, corresponds to the scene to be viewed.
  • The positioning of the objects to construct this scene corresponds to the graphic image 21, which represents the graphic objects.
  • In the case of MPEG-4 coding according to the prior art, each video or still image corresponding to the elements 15 to 19 is transmitted in a video or still-image stream, and the graphic data is transmitted in the graphic stream.
  • A global image is composed from the images relating to the different videos or still images to form the composite image 14 represented at the top of the figure. This global image is coded.
  • Auxiliary data relating to the composition of the overall image and defining the geometric shapes are transmitted in parallel, allowing the elements to be separated. The texture coordinates at the vertices, when these fields are used, allow these shapes to be textured from the composite image.
  • Auxiliary data relating to the construction of the scene and defining the graphic image 21 are transmitted.
  • The composite texture image is transmitted over the video stream.
  • The elements are coded as video objects, and their geometric shapes 22, 23 and texture coordinates at the vertices (referring into the composite image, i.e. the composite texture) are transmitted over the graphic stream.
  • The texture coordinates are the composition information of the composed image.
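A hedged sketch of how such texture coordinates can be derived from the mosaic geometry: each element's pixel rectangle is normalized to [0, 1] so the receiver can texture a graphic shape directly from the composite image. A bottom-left texture-coordinate origin, as in VRML/MPEG-4 texture mapping, is assumed; the function name is illustrative.

```python
def texture_coordinates(x, y, w, h, mosaic_w, mosaic_h):
    """Normalized (u, v) pairs for the four vertices of one element.

    (x, y) is the element's top-left corner in pixel coordinates (image
    origin at the top-left); v is flipped to a bottom-left origin.
    """
    u0, u1 = x / mosaic_w, (x + w) / mosaic_w
    v0, v1 = 1 - (y + h) / mosaic_h, 1 - y / mosaic_h
    # Vertex order: bottom-left, bottom-right, top-right, top-left
    return [(u0, v0), (u1, v0), (u1, v1), (u0, v1)]

# Element 15, say 176x144 pixels at (0, 0) in a 352x288 mosaic:
print(texture_coordinates(0, 0, 176, 144, 352, 288))
# -> [(0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (0.0, 1.0)]
```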
  • The stream which is transmitted can be coded according to the MPEG-2 standard; in this case, it is possible to exploit the functionalities of the circuits of existing platforms integrating the receivers.
  • Elements supplementing the main programs can be transmitted on an additional MPEG-2 or MPEG-4 video stream. This stream can contain several visual elements such as logos and advertising banners, animated or not, which can be combined with one or other of the programs broadcast, at the choice of the broadcaster. These items can also be displayed based on user preferences or profile. An associated interaction can be provided.
  • Two decoding circuits are then used: one for the program, the other for the composite image and the auxiliary data. A spatial multiplexing of the program being broadcast with additional information coming from the composed image is then possible.
  • A single auxiliary video stream can be used for a program package, to supplement several programs or several user profiles.
EP02791510A 2001-07-27 2002-07-24 Method and device for coding a scene Withdrawn EP1433333A1 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
FR0110086A FR2828054B1 (fr) 2001-07-27 2001-07-27 Method and device for coding a scene
FR0110086 2001-07-27
PCT/FR2002/002640 WO2003013146A1 (fr) 2001-07-27 2002-07-24 Method and device for coding a scene

Publications (1)

Publication Number Publication Date
EP1433333A1 (de) 2004-06-30

Family

ID=8866006

Family Applications (1)

Application Number Title Priority Date Filing Date
EP02791510A EP1433333A1 (de) 2001-07-27 2002-07-24 Method and device for coding a scene

Country Status (5)

Country Link
US (1) US20040258148A1 (de)
EP (1) EP1433333A1 (de)
JP (1) JP2004537931A (de)
FR (1) FR2828054B1 (de)
WO (1) WO2003013146A1 (de)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2438004B (en) 2006-05-08 2011-08-24 Snell & Wilcox Ltd Creation and compression of video data
DE102006027441A1 * 2006-06-12 2007-12-13 Attag Gmbh Method and device for generating a digital transport stream for a video program
JP2008131569A (ja) * 2006-11-24 2008-06-05 Sony Corp Image information transmission system, image information transmitting apparatus, image information receiving apparatus, image information transmission method, image information transmitting method, and image information receiving method
TWI382358B (zh) * 2008-07-08 2013-01-11 Nat Univ Chung Hsing Virtual reality data indication method
US9602814B2 (en) 2010-01-22 2017-03-21 Thomson Licensing Methods and apparatus for sampling-based super resolution video encoding and decoding
JP5805665B2 (ja) 2010-01-22 2015-11-04 Thomson Licensing Data pruning for video compression using example-based super-resolution
WO2012033972A1 (en) 2010-09-10 2012-03-15 Thomson Licensing Methods and apparatus for pruning decision optimization in example-based data pruning compression
WO2012033971A1 (en) * 2010-09-10 2012-03-15 Thomson Licensing Recovering a pruned version of a picture in a video sequence for example - based data pruning using intra- frame patch similarity
US8724696B2 (en) * 2010-09-23 2014-05-13 Vmware, Inc. System and method for transmitting video and user interface elements

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5325449A (en) * 1992-05-15 1994-06-28 David Sarnoff Research Center, Inc. Method for fusing images and apparatus therefor
GB9502006D0 (en) * 1995-02-02 1995-03-22 Ntl Transmission system
US5657096A (en) * 1995-05-03 1997-08-12 Lukacs; Michael Edward Real time video conferencing system and method with multilayer keying of multiple video images
JP2962348B2 (ja) * 1996-02-08 1999-10-12 日本電気株式会社 Image code conversion system
JPH1040357A (ja) * 1996-07-24 1998-02-13 Nippon Telegr & Teleph Corp <Ntt> Video creation method
FR2786353B1 (fr) * 1998-11-25 2001-02-09 Thomson Multimedia Sa Method and device for coding images according to the MPEG standard for the insertion of sub-pictures
US6405095B1 (en) * 1999-05-25 2002-06-11 Nanotek Instruments, Inc. Rapid prototyping and tooling system
US7015954B1 (en) * 1999-08-09 2006-03-21 Fuji Xerox Co., Ltd. Automatic video system using multiple cameras
US6714202B2 (en) * 1999-12-02 2004-03-30 Canon Kabushiki Kaisha Method for encoding animation in an image file
US6791574B2 (en) * 2000-08-29 2004-09-14 Sony Electronics Inc. Method and apparatus for optimized distortion correction for add-on graphics for real time video
US7827488B2 (en) * 2000-11-27 2010-11-02 Sitrick David H Image tracking and substitution system and methodology for audio-visual presentations
US7027655B2 (en) * 2001-03-29 2006-04-11 Electronics For Imaging, Inc. Digital image compression with spatially varying quality levels determined by identifying areas of interest
WO2003003720A1 (en) * 2001-06-28 2003-01-09 Omnivee Inc. Method and apparatus for control and processing of video images

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO03013146A1 *

Also Published As

Publication number Publication date
JP2004537931A (ja) 2004-12-16
US20040258148A1 (en) 2004-12-23
FR2828054A1 (fr) 2003-01-31
FR2828054B1 (fr) 2003-11-28
WO2003013146A1 (fr) 2003-02-13

Similar Documents

Publication Publication Date Title
US11087549B2 (en) Methods and apparatuses for dynamic navigable 360 degree environments
EP1233614B1 System for video transmission and video processing to generate a user mosaic
US20080101456A1 (en) Method for insertion and overlay of media content upon an underlying visual media
EP2338278B1 Method for presenting an interactive video/multimedia application using content-aware metadata
US20070005795A1 (en) Object oriented video system
EP1255409A1 Conversion between textual and binary BIFS format
EP1433333A1 Method and device for coding a scene
EP2382756B1 Method for modelling the display of a terminal using macroblocks, by means of masks characterized by a motion vector and transparency data
JP4272891B2 (ja) Apparatus, server, system and method for generating mutual photometric effects
WO2021109412A1 (en) Volumetric visual media process methods and apparatus
US7439976B2 (en) Visual communication signal
EP1236352B1 Digital television broadcasting method, corresponding digital signal and device
EP1354479B1 Method and device for managing interactions in the MPEG-4 standard
CN115002470A Media data processing method, apparatus, device, and readable storage medium
Bove Object-oriented television
US20120019621A1 (en) Transmission of 3D models
Deshpande et al. Omnidirectional MediA Format (OMAF): toolbox for virtual reality services
KR20030005178A Method and device for video scene composition from multiple data
Kauff et al. The MPEG-4 standard and its applications in virtual 3D environments
FR2780843A1 Method for processing video data intended to be displayed on a screen, and device implementing the method
EP4078971A1 Method and devices for encoding, decoding and rendering 6DoF content from composed 3DoF+ elements
FR2940703B1 Method and device for modeling a display
Arsov A framework for distributed 3D graphics applications based on compression and streaming
Lim et al. MPEG Multimedia Scene Representation
Kitson Multimedia, visual computing, and the information superhighway

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20040128

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LI LU MC NL PT SE SK TR

AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

RIN1 Information on inventor provided before grant (corrected)

Inventor name: KERDRANVAT, MICHEL

Inventor name: BLONDE, LAURENT

Inventor name: KERVELLA, GWENAEL

Inventor name: KERBIRIOU, PAUL

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: THOMSON LICENSING

17Q First examination report despatched

Effective date: 20100709

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20101120