
KR20020032862A - An object-based multimedia service system and a service method using a moving picture encoding

Info

Publication number: KR20020032862A
Application number: KR1020000063577A
Authority: KR
Inventors: 손세훈; 신재섭; 김연배; 류성걸; 최영민; 이형준
Original assignee: 신재섭; (주) 엠펙솔루션
Priority date: 2000-10-27
Filing date: 2000-10-27
Publication date: 2002-05-04

Abstract

PURPOSE: An object-based multimedia service system and service method using moving picture coding are provided to offer high-quality moving pictures to a user by introducing a shape selector. CONSTITUTION: A multimedia service system that transmits moving picture data from a service provider terminal to a user terminal through a network includes an interactive interface (250) and a shape selector (110). The interactive interface controls the output of at least one moving picture at the user's request and lets the user select the desired shape information. The shape selector selects the shape information the user wants from the moving picture data.

Description

OBJECT-BASED MULTIMEDIA SERVICE SYSTEM AND A SERVICE METHOD USING A MOVING PICTURE ENCODING

The present invention relates to multimedia, and more particularly to a multimedia service system and service method that encode and decode video using shape information of a specific region of interest and play back video using that shape information.

Multimedia has recently attracted attention from many people across many fields and industries. Multimedia is not limited to the PC; it is understood as a broader concept spanning communications, broadcasting, home appliances, and computers. Media, in the narrow sense, is a generic term for the means of expressing information attributes (representation media) and the means of physically delivering information (delivery media). Multimedia therefore refers to handling multiple representation media in an integrated way over the same delivery medium. Multimedia has important characteristics in terms of delivery as well as representation. In particular, on the delivery side, attention has focused on how to compress and deliver representation media efficiently. Image compression has progressed from the conventional JPEG (Joint Photographic Experts Group, the international standard for color still-image compression) to the MPEG (Moving Picture Experts Group) algorithms.

Hereinafter, a conventional video coding method is briefly described with reference to the accompanying drawings.

FIGS. 1A and 1B schematically illustrate conventional video encoding and decoding methods.

As shown in FIG. 1A, the conventional general video encoding and decoding method simply encodes a 2D image as it is, transmits it, and then receives, decodes, and plays it back. As an improvement over this method, MPEG-4 introduces a shape encoder/decoder as shown in FIG. 1B: the shape and the image information within the shape are encoded, the combined encoded information is transmitted, and it is then decoded again, so that particularly meaningful objects within a 2D image can be separated and processed independently.

However, such conventional encoding and decoding methods must extract the shape information of meaningful objects accurately and quickly, and because a large amount of data is processed on a limited system, encoding and decoding efficiency drops and image quality deteriorates.

To solve this problem, the technical object of the present invention is to introduce a shape selector so that the user can be provided with a high-quality video service.

A further object is to provide a real-time object-based multimedia service in which the user can select and adjust the number, size, position, and the like of the video contents provided through an interactive interface.

FIG. 2 is a block diagram of an interactive multimedia service system according to a first embodiment of the present invention.

FIG. 3 illustrates, by way of screens, an interactive multimedia service method according to the first embodiment of the present invention.

FIG. 4 is a block diagram of an interactive multi-object multimedia service system according to another example of the first embodiment of the present invention.

FIG. 5 illustrates, by way of screens, an interactive multi-object multimedia service method according to another example of the first embodiment of the present invention.

FIG. 6 illustrates, by way of screens, an interactive multi-object multimedia service method in which playback adapts to the user's conditions.

FIG. 7 is a block diagram of an interactive multi-object videophone/videoconference terminal according to a second embodiment of the present invention.

FIG. 8 is a block diagram of an interactive multi-object videophone/videoconference terminal capable of upstreaming according to another example of the second embodiment of the present invention.

To achieve this object, an object-based multimedia service system using video encoding according to one aspect of the present invention is

a multimedia service system that supports transmission of video data from a service provider terminal to a user terminal through a network, and includes:

an interactive interface that receives the user's requests, controls the output of one or more videos, and lets the user select desired shape information; and

a shape selector that receives the shape decision from the interactive interface and selects, from the video data, the shape information matching the user's intent.

An object-based multimedia service method using video encoding according to another aspect of the present invention is

a service method using a multimedia service system that supports transmission of video data from a service provider terminal to a user terminal through a network, and includes:

receiving the user's request from an interactive interface and causing a shape selector to select a shape according to the request; and

encoding the selected shape and the image within the shape and transmitting them in multiplexed form, then receiving and demultiplexing the information transmitted through the network, decoding it, and compositing the shape with one or more multimedia contents.

Hereinafter, the most preferred embodiments, by which a person of ordinary skill in the art can readily practice the present invention, are described in detail with reference to the accompanying drawings.

As shown in FIG. 2, in the interactive multimedia service system according to the first embodiment of the present invention, when encoded video is transmitted from the service provider terminal 100 over a network (not shown), the user terminal 200 decodes and displays it.

The service provider terminal 100 includes: a shape selector 110 that receives video from a video input unit (not shown) and determines the shape from the user's selection, from a system default value, or through analysis of the input video; an encoding unit 120 that receives the video, audio, and text information for the image determined by the shape selector 110 and, using MPEG-4 technology, organizes them into a composite of hierarchically structured audio/visual objects (AVOs) and encodes it; an encoding buffer 130 that stores the encoded video data; and a multiplex transmitter 140 that aligns the transmission timing of the encoded video, audio, and text information and transmits them in synchronization.

The shape selector 110 includes: a default shape unit 111 that uses a predetermined default shape; an image shape analysis unit 112 that uses a shape determined by analyzing the input video; a shape extraction unit 113 that extracts the selected shape for use; and a user-selected shape unit 114 that uses a shape selected by the user through the interactive interface 250.
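The choice among shape sources can be sketched in Python as follows. This is only an illustrative sketch: the priority order (user selection first, then analysis, then default), the centered-rectangle default, and the brightness threshold are assumptions made for the example, and the names `ShapeSource`, `default_mask`, `analyze_mask`, and `select_shape` are hypothetical. The shape extraction unit 113 is not modeled.

```python
from enum import Enum
from typing import List, Optional, Tuple

Mask = List[List[int]]   # 1 = pixel inside the shape, 0 = outside
Frame = List[List[int]]  # grayscale samples, one list per row


class ShapeSource(Enum):
    DEFAULT = "default shape unit (111)"
    ANALYSIS = "image shape analysis unit (112)"
    USER = "user-selected shape unit (114)"


def default_mask(width: int, height: int) -> Mask:
    """Hypothetical default shape: a centered rectangle (assumption)."""
    x0, x1 = width // 4, width - width // 4
    y0, y1 = height // 4, height - height // 4
    return [[1 if x0 <= x < x1 and y0 <= y < y1 else 0 for x in range(width)]
            for y in range(height)]


def analyze_mask(frame: Frame, threshold: int = 128) -> Mask:
    """Stand-in for shape analysis: treat bright pixels as the region of interest."""
    return [[1 if p >= threshold else 0 for p in row] for row in frame]


def select_shape(frame: Frame, user_mask: Optional[Mask] = None,
                 analyze: bool = False) -> Tuple[ShapeSource, Mask]:
    """Pick a shape; the priority order here is an assumption of this sketch."""
    if user_mask is not None:
        return ShapeSource.USER, user_mask
    if analyze:
        return ShapeSource.ANALYSIS, analyze_mask(frame)
    return ShapeSource.DEFAULT, default_mask(len(frame[0]), len(frame))
```

A real analysis unit would segment the object of interest rather than threshold brightness; the threshold only keeps the sketch self-contained.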

The encoding unit 120 includes: a shape encoder 121 that encodes the shape selected by the shape selector 110; and an image encoder 122 that uses the selected shape to obtain, from the input video data, the image data inside the shape and encodes it.

The user terminal 200 includes: a multiplex receiver 210 that receives and reconstructs the shape and image bitstreams transmitted from the multiplex transmitter 140 through the network; a decoding buffer 220 that stores the bitstream delivered by the multiplex receiver 210 and passes it to the decoding unit 230 with regard to playback timing; a decoding unit 230 that decodes the bitstreams of shape and image data input from the decoding buffer 220; a shape-image synthesizer 240 that composites the shape and image data decoded by the decoding unit 230 into the image object to be displayed; a video player (not shown) that outputs and plays the composited image object; and an interactive interface 250 that controls the shape selector 110 by applying the video output method selected according to the user's intent.

Hereinafter, the operation of the interactive multimedia system according to the first embodiment of the present invention is described in detail with reference to the accompanying drawings.

The user selects a desired shape through the interactive interface 250. The user-selected shape unit 114 of the shape selector 110 extracts the user-selected shape sent from the interactive interface 250 and passes the shape information to the encoding unit 120. The shape encoder 121 encodes the selected shape as shown in FIG. 3, and the image encoder 122 uses the selected shape to obtain, from the input video data, the image data inside the shape and encodes it. The encoding unit 120 stores its compressed output bitstream in the encoding buffer 130 with regard to synchronization (timing), and the compressed bitstream is output to the multiplex transmitter 140 for transmission. The multiplex transmitter 140 reconstructs the input bitstream and transmits it to the multiplex receiver 210 of the user terminal 200.
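The split between shape coding and coding of the image inside the shape can be illustrated with a small Python sketch. It is not the MPEG-4 shape or texture coder: run-length coding of the binary mask and plain collection of the in-shape samples are stand-ins chosen for brevity, and all function names are hypothetical.

```python
from typing import List

Mask = List[List[int]]
Frame = List[List[int]]


def encode_shape(mask: Mask) -> List[int]:
    """Run-length code the flattened binary mask; the first run counts zeros."""
    runs, current, count = [], 0, 0
    for bit in (b for row in mask for b in row):
        if bit == current:
            count += 1
        else:
            runs.append(count)
            current, count = bit, 1
    runs.append(count)
    return runs


def decode_shape(runs: List[int], width: int) -> Mask:
    """Rebuild the mask from alternating runs of zeros and ones."""
    flat, value = [], 0
    for n in runs:
        flat.extend([value] * n)
        value ^= 1
    return [flat[i:i + width] for i in range(0, len(flat), width)]


def encode_image(frame: Frame, mask: Mask) -> List[int]:
    """Keep only the samples that lie inside the selected shape."""
    return [p for frow, mrow in zip(frame, mask)
            for p, m in zip(frow, mrow) if m]


def decode_image(samples: List[int], mask: Mask, fill: int = 0) -> Frame:
    """Place decoded samples back inside the shape; outside pixels get `fill`."""
    it = iter(samples)
    return [[next(it) if m else fill for m in row] for row in mask]
```

Because only the samples inside the mask are carried, the image payload shrinks with the selected region, which is the efficiency argument the description makes for coding only the region of interest.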

The multiplex receiver 210 separates the transmitted compressed bitstream by object and reconstructs it, and the decoding buffer 220 stores the bitstream received from the multiplex receiver 210 with regard to playback timing and then passes it to the decoding unit 230. The shape decoder 231 decodes the bitstream related to shape information input from the decoding buffer 220, and the image decoder 232 decodes the image information using the shape information output from the shape decoder 231 and the image-related bitstream input through the decoding buffer 220.

FIG. 4 is a block diagram of an interactive multi-object multimedia service system according to another example of the first embodiment of the present invention, FIG. 5 illustrates, by way of screens, an interactive multi-object multimedia service method according to another example of the first embodiment, and FIG. 6 illustrates, by way of screens, an interactive multi-object multimedia service method in which playback adapts to the user's conditions.

The interactive multi-object multimedia service system according to another example of the first embodiment uses the same reference numerals as the components of the first embodiment, and description of the parts identical to the first embodiment is omitted below.

As shown in FIG. 4, when one or more videos are input, one or more shape selectors 110 each select a shape according to the decision made through the interactive interface 250, and one or more encoding units 120 encode the shape selected by the corresponding shape selector 110. Each encoding buffer 130 stores the encoded shape and image data, and the multiplex transmitter 140 combines the shape and image data with one or more multimedia contents and transmits them. The multiplex receiver 210 separates the received shape and image data and the multimedia contents and stores them in the decoding buffers 220; each decoding unit 230 reads the information for its data from the decoding buffer 220 and decodes it; and the object image synthesizer composites the shape and image data decoded by each decoding unit 230 with the multimedia contents and displays the result through the video output unit 260.
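A minimal Python sketch of the multiplexing and per-object separation described here is given below. The `Packet` layout (timestamp, object id, kind, payload) is a hypothetical container invented for the example and is not the MPEG-4 systems format; each input stream is assumed to be already ordered by timestamp.

```python
import heapq
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Packet(NamedTuple):
    timestamp: int   # transmission timing, in arbitrary ticks (assumption)
    object_id: str   # which audio/visual object the packet belongs to
    kind: str        # "shape" or "image"
    payload: bytes


def multiplex(*streams: Iterable[Packet]) -> List[Packet]:
    """Interleave per-object streams into one stream ordered by timestamp."""
    return list(heapq.merge(*streams, key=lambda p: p.timestamp))


def demultiplex(packets: Iterable[Packet]) -> Dict[str, List[Packet]]:
    """Split the received stream back into one packet list per object."""
    per_object: Dict[str, List[Packet]] = defaultdict(list)
    for packet in packets:
        per_object[packet.object_id].append(packet)
    return dict(per_object)
```

For example, a main video object and a small advertisement object can be merged into one stream on the sending side and recovered as two separate packet lists on the receiving side, ready for their respective decoding buffers.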

As shown in FIG. 6, the encoding unit 120 and the decoding unit 230 of the present invention use the image and shape information encoding/decoding technology of the MPEG-4 Core Profile. The MPEG-4 Core Profile includes a shape information encoder in addition to the existing general 2D video encoding method, so that meaningful objects within a video can be separated and encoded/decoded independently, which secures various additional functionality. In the present invention, therefore, the shape information indicating the region of interest and the image information inside it are processed with MPEG-4 object-based shape encoding/decoding and image encoding/decoding.

In the service method shown in FIG. 5, the small advertisement video (multimedia content B) uses temporal scalability and spatial scalability: it is initially served by default with the base layer, and, according to the user's selection, either or both are served with the enhancement layer.

Temporal scalability is a method of hierarchically encoding the number of frames served per second. For example, if the original video can be served at 30 frames per second, the base layer samples and serves 10 frames per second, and the additional enhancement layer adds the remaining 20 frames per second so that the complete 30 frames per second can be served. Spatial scalability is a method of hierarchically encoding the spatial resolution of a video from low to high. For example, if the original video has a resolution of 352x288 pixels, the base layer is sampled to a video with a resolution of 176x144 pixels, encoded, and served, and the additional enhancement layer adds supplementary information to the base-layer video so that the complete service at the original 352x288 resolution can be provided.
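The temporal example (30 frames per second split into a 10-frame base layer and a 20-frame enhancement layer) can be sketched in Python as follows. The regular every-third-frame sampling pattern and the function names are assumptions of this sketch; the text only fixes the 10/20 split.

```python
from typing import List, Optional, Tuple

IndexedFrame = Tuple[int, object]  # (frame index, frame data)


def split_temporal(frames: List[object], base_every: int = 3
                   ) -> Tuple[List[IndexedFrame], List[IndexedFrame]]:
    """Put every `base_every`-th frame in the base layer, the rest in enhancement."""
    base = [(i, f) for i, f in enumerate(frames) if i % base_every == 0]
    enhancement = [(i, f) for i, f in enumerate(frames) if i % base_every != 0]
    return base, enhancement


def merge_temporal(base: List[IndexedFrame],
                   enhancement: Optional[List[IndexedFrame]] = None
                   ) -> List[object]:
    """Play the base layer alone, or both layers restored to display order."""
    layers = list(base) + list(enhancement or [])
    return [frame for _, frame in sorted(layers, key=lambda item: item[0])]
```

With one second of 30 frames as input, `split_temporal` yields 10 base-layer frames and 20 enhancement-layer frames; `merge_temporal(base)` gives the reduced-rate service and `merge_temporal(base, enhancement)` restores all 30 frames.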

In FIG. 6, when image object 2 is encoded/decoded using MPEG-4 shape and image spatial scalability, the information of image object 1 is used to raise the encoding/decoding efficiency. The user can therefore not only be served image object 1 but can also, by choice, be served image object 2 through an enhancement-layer bitstream carrying an optimal amount of information.
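The idea of coding the higher-resolution object relative to the lower-resolution one can be sketched in Python as base-plus-residual coding. This is a simplified stand-in, not the MPEG-4 spatial scalability tool: 2:1 subsampling, pixel-replication upsampling, and an uncompressed residual are assumptions of the sketch, and even image dimensions are assumed.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]


def downsample(image: Image) -> Image:
    """Base layer: keep every second sample in both directions."""
    return [row[::2] for row in image[::2]]


def upsample(image: Image) -> Image:
    """Predict the full-size image by repeating each base sample 2x2."""
    out: Image = []
    for row in image:
        wide = [p for p in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out


def encode_spatial(image: Image) -> Tuple[Image, Image]:
    """Return (base layer, enhancement residual against the upsampled base)."""
    base = downsample(image)
    prediction = upsample(base)
    residual = [[o - p for o, p in zip(orow, prow)]
                for orow, prow in zip(image, prediction)]
    return base, residual


def decode_spatial(base: Image, residual: Optional[Image] = None) -> Image:
    """Base only gives a coarse picture; adding the residual restores the original."""
    prediction = upsample(base)
    if residual is None:
        return prediction
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, residual)]
```

For a 352x288 original, `downsample` produces the 176x144 base layer mentioned in the description, and the enhancement layer only needs to carry what the base-layer prediction misses.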

Hereinafter, the operation of the interactive multi-object multimedia service system according to the second embodiment of the present invention is described with reference to the accompanying drawings.

In the interactive multi-object multimedia service system according to the second embodiment, description of the parts identical to the first embodiment is omitted.

The interactive multi-object videophone/videoconference terminal according to the second embodiment of the present invention includes: a video input unit 210 that receives video information through a camera or the like; a content encoding/transmitting unit 220 that encodes and transmits the shape and image data determined by the user's selection for one or more contents; a content decoding/receiving unit 230 that receives and decodes transmitted shape and image data; a display unit 240 that presents the received shape and image data; and an interactive interface 250 that lets the user determine the desired shape, position, resolution, image quality, and bitstream format.

The content encoding/transmitting unit 220 includes: a shape selector 221 that determines the shape, for one or more contents, from the content selected according to the user's decision received from the interactive interface 250, from a system default value, or through analysis of the input video; an encoding unit 222 that receives the data determined by the shape selector 221, organizes the video, audio, and text information into a composite of hierarchically structured objects, and encodes it; an encoding buffer 223 that stores the encoded video data; and a multiplex transmitter 224 that aligns the transmission timing of the encoded video, audio, and text information and transmits them in synchronization.

The content decoding/receiving unit 230 includes: a multiplex receiver 231 that receives and reconstructs the shape and image bitstreams transmitted from the multiplex transmitter 224 through the network; a decoding buffer 232 that stores the bitstream delivered by the multiplex receiver 231 and passes it to the decoding unit 233 with regard to playback timing; a decoding unit 233 that decodes the bitstreams of shape and image data input from the decoding buffer 232; and an object image synthesizer 234 that composites the shape and image data decoded by the decoding unit 233 into the image object to be displayed.

The video input unit 210 receives video from a camera or the like, and the shape selector 221 receives the user's selections from the interactive interface 250 and determines the shape, position, resolution, image quality, and bitstream format of the video accordingly. The determined shape and image data are encoded by the encoding unit 222. The multiplex transmitter 224 mixes the objectified shape and image data with the contents and transmits them as a bitstream. The multiplex receiver 231 separates the transmitted bitstream into object units, and the decoding unit 233 decodes, from the object-unit data stored in the decoding buffer 232, each object in response to requests such as object selection and changes to an object's position, resolution, or image quality. The object image synthesizer 234 composites the decoded object data, and the display unit 240 shows the composited multimedia image.
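The compositing step performed by an object image synthesizer can be sketched in Python as pasting each decoded object onto a canvas at its chosen position, writing only the pixels inside its shape. The `DecodedObject` structure, the draw order (later objects on top), and the clipping behavior are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]


@dataclass
class DecodedObject:
    x: int          # left edge on the canvas, chosen via the interactive interface
    y: int          # top edge on the canvas
    pixels: Image   # decoded image samples
    mask: Image     # decoded shape: 1 inside the object, 0 outside


def composite(width: int, height: int, objects: List[DecodedObject],
              background: int = 0) -> Image:
    """Draw objects in list order; pixels outside a shape leave the canvas untouched."""
    canvas = [[background] * width for _ in range(height)]
    for obj in objects:
        for dy, (prow, mrow) in enumerate(zip(obj.pixels, obj.mask)):
            for dx, (pixel, inside) in enumerate(zip(prow, mrow)):
                cx, cy = obj.x + dx, obj.y + dy
                if inside and 0 <= cx < width and 0 <= cy < height:
                    canvas[cy][cx] = pixel
    return canvas
```

Moving an object on screen then amounts to changing its `x` and `y` before compositing, without re-decoding the object itself.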

In the following description of the interactive multi-object videophone/videoconference terminal according to another example of the second embodiment, description of the parts identical to the components already described is omitted.

The interactive multi-object videophone/videoconference terminal according to another example of the second embodiment further includes: an upstream transmitter 260 that informs the counterpart sender of the receiver's selected shape and reception environment so that optimal bitstream information is delivered; and an upstream receiver 270 that identifies the sender's request and receives optimal bitstream information from the counterpart sender.

As shown in FIG. 8, with upstreaming the user informs the counterpart, through the upstream transmitter 260, of the reception selection and reception environment for the mixed video information so that the bitstream information best suited to them is delivered, and the receiver obtains, through the upstream receiver 270, the bitstream information best suited to the receiver's reception environment.
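One way to picture upstreaming is a small report sent back to the sender, which then decides which layers to transmit. The Python sketch below is illustrative only: the report fields, the bitrate figures, and the decision rule are hypothetical, since the text does not specify how "optimal" is determined; only the 352x288 full resolution is taken from the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReceiverReport:
    """Hypothetical upstream message describing the receiver's choices and environment."""
    selected_shape: str
    bandwidth_kbps: int
    display_width: int
    display_height: int


def choose_layers(report: ReceiverReport,
                  base_kbps: int = 64,        # assumed base-layer bitrate
                  enhancement_kbps: int = 192,  # assumed enhancement-layer bitrate
                  full_width: int = 352,
                  full_height: int = 288) -> List[str]:
    """Always send the base layer; add enhancement only if the receiver can use it."""
    layers = ["base"]
    fits_display = (report.display_width >= full_width
                    and report.display_height >= full_height)
    enough_bandwidth = report.bandwidth_kbps >= base_kbps + enhancement_kbps
    if fits_display and enough_bandwidth:
        layers.append("enhancement")
    return layers
```

A receiver on a narrow link or a small display would thus be served the base layer only, while a receiver with a full-size display and sufficient bandwidth would also receive the enhancement layer.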

The configurations of the embodiments of the present invention can of course be modified and changed in many ways without departing from the gist of the present invention, and the present invention is not limited to the embodiments.

As described above, the object-based multimedia service system and service method using video encoding of the present invention are designed so that only the user's specific region of interest is encoded and decoded in real time, which raises encoding and decoding efficiency and makes it possible to provide the user with multimedia content of improved image quality and resolution for the same amount of information. In particular, by using a shape selector that lets a service provider, a service user, or a sender or receiver in video communication select shape information through an interactive interface, content best suited to the service environment can be provided.

In addition, the user can select and adjust the number, size, position, and the like of the video contents provided through the interactive interface.

Claims

An object-based multimedia service system using video encoding, the system supporting transmission of video data from a service provider terminal to a user terminal through a network and comprising:

an interactive interface that receives a user's requests, controls the output of one or more videos, and allows the user to select desired shape information; and

a shape selector that receives the shape decision from the interactive interface and selects, from the video data, shape information matching the user's intent.

The object-based multimedia service system using video encoding of claim 1, further comprising:

an encoder that encodes at least one shape selected by the shape selector and the image within the at least one selected shape;

an encoding buffer that stores the bitstream data of the encoded shape and image output from the encoder;

a multiplex transmitter that reconstructs the bitstream data input from the encoding buffer and transmits it through the network;

a multiplex receiver that receives and reconstructs the bitstream data transmitted through the network;

a decoding buffer that stores the reconstructed bitstream from the multiplex receiver with regard to playback timing;

a decoder that decodes one or more shape and image data input from the decoding buffer; and

a shape-image synthesizer that composites and reproduces the shape data and the image data.

The object-based multimedia service system using video encoding of claim 1, wherein the shape selector comprises:

a default shape unit that selects and uses a predetermined default shape;

an image shape analysis unit that analyzes the input video;

a shape extraction unit that extracts and uses the shape of an object at the user's request; and

a user-selected shape unit that uses one or more shapes selected by the user or the service provider.

The system of claim 2, wherein the encoder comprises:

a shape encoder that encodes the shape selected by the shape selector; and

an image encoder that uses the shape selected by the shape selector to obtain, from the input video data, the image data inside the selected shape and encodes it.

The system of claim 2, wherein the decoder comprises:

a shape decoder that decodes the bitstream related to shape data input from the decoding buffer; and

an image decoder that decodes the image data using the shape information output from the shape decoder and the image-related bitstream input from the decoding buffer via the multiplex receiver.

The system of claim 2, further comprising:

an upstream transmitter that receives the user's shape selection request, content object selection, and reception environment information from the interactive interface; and

an upstream receiver that receives the shape selection request, content object selection, and reception environment information transmitted from the upstream transmitter through the network, obtains and encodes the shape and image data best suited to that information through the shape selector, shape encoder, and image encoder, and controls the multiplex transmitter to transmit an optimal bitstream.

An object-based multimedia service method using video encoding, performed with a multimedia service system that supports transmission of video data from a service provider terminal to a user terminal through a network, the method comprising:

receiving a user's request from an interactive interface and causing a shape selector to select a shape according to the request; and

when the selected shape and the image within the shape are encoded and the encoded information is transmitted in multiplexed form together with additional multimedia content information, receiving and decoding the information transmitted through the network and compositing the information having the selected shape with the additional multimedia content.

The method of claim 7, wherein the shape selection step

selectively uses one of: a method using a shape selected by the user or the service provider from one or more selectable shapes; a method using a predetermined default shape; a method using a shape determined through analysis of the input video; and a method of extracting a shape and using the extracted shape.

The method of claim 8, wherein, in the synthesis step,

the upstream transmitter transmits the user's request, the user's selection, and the user's reception environment through the network, and the upstream receiver identifies the counterpart user's request, selection, and reception environment and causes the upstream transmitter to transmit an optimal bitstream.

The method of claim 9, wherein, in the synthesis step,

the output multimedia object is processed together with one or more other multimedia objects, with changes to its size, position, and information links, under an environment set by the user's selection.

The method of claim 9, wherein, in the synthesis step,

in addition to the main object provided to the user, at least one object selected by the user, or all objects, among the other multimedia objects that can be provided simultaneously are played simultaneously with the main object or in the order selected by the user, and

various types of information, including an advertisement video, a still image, or 2D/3D graphics, may be used in place of the main object.

The method of claim 9, wherein, in the synthesis step,

if the user wants to play the full video in the general format, a bitstream of a first image object capable of playing the general-format video is selected and provided, and

if the user wants to play the region inside the shape selected by the shape selector at high resolution, a scalable-coded bitstream that encodes a second image object, which is the corresponding selected region of the first image object, at high resolution is used.