KR101986995B1

KR101986995B1 - Media playback apparatus and method for synchronously reproducing video and audio on a web browser

Info

Publication number: KR101986995B1
Application number: KR1020170087661A
Authority: KR
Inventors: 박종찬; 정승원
Original assignee: Hanwha Techwin Co., Ltd.
Priority date: 2017-01-20
Filing date: 2017-07-11
Publication date: 2019-09-30
Also published as: KR20180086112A; KR101942269B1; KR101942270B1; KR20180086113A; KR20180086115A; KR101931514B1; KR20180086114A

Abstract

The present invention relates to a media playback apparatus and method for playing video and audio in synchronization on a web browser. In the present invention, audio is buffered so that it is output at the time the video is output, which synchronizes the video and audio outputs. According to another embodiment of the present invention, the video may instead be synchronized to the time at which the audio is output.
According to the present invention, synchronization is not a problem even when video and audio are decoded by different decoders, which provides an environment in which video and audio can be processed separately. It is therefore possible, in a non-plug-in environment, to implement and use a decoder separate from the decoder embedded in the web browser, which reduces the dependency on the media codec formats that a given decoder supports.

Description

Media playback apparatus and method for synchronously reproducing video and audio on a web browser

The present invention relates to an apparatus and method for playing media in a web browser, and more particularly to an apparatus and method for playing media with video and audio synchronized when the video or audio is decoded in container units.

A known way for a user to play media data in a web browser over the Internet is to use a plug-in in which the codec, decoder, and renderer are written in native code. ActiveX and NPAPI are the most widely used web browser plug-ins of this kind.

ActiveX was developed by Microsoft by combining two technologies, the Component Object Model (COM) and Object Linking and Embedding (OLE). In the narrow sense, however, the term usually refers to an ActiveX control used as an add-on in the Internet Explorer web browser. ActiveX is used to play media in Internet Explorer.

NPAPI is an API developed in the Netscape era and is similar in function to ActiveX in Internet Explorer. It was provided so that external applications could be pulled in as plug-ins to extend the web browser at a time when the early web environment was limited. In other words, it was developed to play music and video on early web pages; examples include Java applets, Adobe Flash, and RealPlayer.

However, because plug-ins have been used by attackers as a channel for distributing malicious code, web browser vendors are dropping support for them. Google, which develops and distributes Chrome, no longer supports NPAPI as of Chrome version 45, and Microsoft no longer supports ActiveX as of Edge, the default browser of Windows 10.

One way to play media in a web browser without a plug-in is to perform decoding and rendering with the HTML5 video tag. HTML5 is the fifth version of the HTML markup language, developed by the W3C and web browser vendors, and its video tag provides an environment in which media can be played without a plug-in. The HTML5 video tag has been fully supported since Chrome 4.0, Explorer 9.0, Firefox 3.5, Safari 4.0, and Opera 10.5.

Decoding with the video tag handles high-resolution and high-fps video with good performance. However, the video formats supported by the video tag are limited; currently only three formats are supported: MP4, WebM, and Ogg. More specifically, the video tag supports codec formats such as H.264, AAC, and MP3 in MP4; VP8, VP9, Vorbis, and Opus in WebM; and Theora and Vorbis in Ogg. It does not support the video codec format H.265 or audio codec formats such as G.711 and G.726.

The limitation imposed by the video tag's codec formats can be mitigated by using a decoder implemented in JavaScript. Because a JavaScript decoder is not restricted to particular video formats, it can decode media in codec formats that the video tag does not support.

Even with this workaround, a limitation remains. The video tag basically receives and decodes video in container units, each packaging a plurality of frames, so synchronization can become a problem when video and audio are decoded by different decoders, namely the video tag decoder and a JavaScript decoder. For example, suppose H.264 video is decoded by the video tag in container units and G.711 audio is decoded by a JavaScript decoder. If a container holds 30 frames and the video runs at 1 fps, the video may be output up to 30 seconds late, so the audio is played first.
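The 30-second figure follows from dividing the number of frames in a container by the frame rate. The following is a minimal sketch of that arithmetic; the function name is hypothetical, and the model (the decoder outputs nothing until a whole container has arrived) is a simplifying assumption rather than a description of any particular browser.

```python
def max_container_delay_seconds(frames_per_container: int, fps: float) -> float:
    """Worst-case wait before a container's first frame can be shown.

    Assumes (for illustration only) that the decoder needs the whole
    container before it outputs any frame of that container.
    """
    if frames_per_container <= 0 or fps <= 0:
        raise ValueError("frames_per_container and fps must be positive")
    return frames_per_container / fps


# The example from the text: 30 frames per container at 1 fps.
delay = max_container_delay_seconds(30, 1.0)
print(delay)  # 30.0
```

At a more typical 30 fps the same container size would give a delay of about one second, which is why the problem is most visible with low-frame-rate streams.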

Accordingly, the present invention proposes a method that enables media playback without video and audio synchronization problems.

An object of the present invention is to propose a media playback method in which synchronization of video and audio is not a problem when video is decoded in container units.

The objects of the present invention are not limited to the object mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

To solve the above problem, a media playback apparatus for playing video and audio in synchronization on a web browser according to an embodiment of the present invention may include: a receiving unit that receives media data generated by a media service apparatus using a communication protocol that supports web services; a data separation unit that separates the received media data into video data and audio data; a container unit that packages the frames constituting the video data in a unit number and converts them into chunk data; a media restoration unit that decodes the converted chunk data with a decoder embedded in the web browser to restore video and, when outputting the restored video, provides time information on the point at which the restored video is output, in units of the chunk data; and an audio sync unit that outputs the audio data in synchronization with the restored video based on the time information provided in units of the chunk data.
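The role of the audio sync unit can be illustrated with a small buffer that holds decoded audio until time information for a video chunk arrives. This is only a sketch under stated assumptions: the class and method names are hypothetical, audio frames are assumed to arrive in timestamp order, and dropping audio older than the chunk is one possible policy that the text does not prescribe.

```python
from collections import deque
from typing import Deque, List, Tuple

# (presentation timestamp in seconds, decoded audio payload)
AudioFrame = Tuple[float, bytes]


class AudioSyncBuffer:
    """Hypothetical buffer that releases audio per video chunk."""

    def __init__(self) -> None:
        self._pending: Deque[AudioFrame] = deque()

    def push(self, timestamp: float, payload: bytes) -> None:
        # Assumption: frames are pushed in increasing timestamp order.
        self._pending.append((timestamp, payload))

    def release_for_chunk(self, chunk_start: float, chunk_duration: float) -> List[AudioFrame]:
        """Return the audio frames that fall inside the chunk being output."""
        chunk_end = chunk_start + chunk_duration
        released: List[AudioFrame] = []
        while self._pending and self._pending[0][0] < chunk_end:
            frame = self._pending.popleft()
            if frame[0] >= chunk_start:
                released.append(frame)
            # Frames older than the chunk start are dropped (assumed policy).
        return released


buffer = AudioSyncBuffer()
for ts in (0.0, 0.5, 1.0, 1.5):
    buffer.push(ts, b"pcm")
first = buffer.release_for_chunk(chunk_start=0.0, chunk_duration=1.0)
print([ts for ts, _ in first])  # [0.0, 0.5]
```

In this sketch the audio is never played ahead of the video, because frames are released only when the time information for the corresponding chunk is reported.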

To solve the above problem, a media service apparatus according to an embodiment of the present invention transmits media data to the media playback apparatus and may include: a module storage unit that stores a script module required to play the media data on a web browser of the media playback apparatus; a module transmission unit that transmits the script module to the media playback apparatus in response to a connection from the media playback apparatus; a packetizing unit that packetizes the media data to generate transport packets; and a web server that establishes a communication session with the playback apparatus and transmits the transport packets to the media playback apparatus in response to a request from the media playback apparatus. The script module may include a process of receiving the communication packets through the communication session, a process of packaging the video frames contained in the communication packets in a unit number and converting them into chunk data, and a process of outputting the audio data contained in the communication packets in synchronization with the chunk data, based on time information on the point at which the converted chunk data is decoded and output by a media restoration unit installed in the media playback apparatus.

Also to solve the above problem, a media playback apparatus for playing video and audio in synchronization on a web browser according to an embodiment of the present invention may include: a receiving unit that receives media data generated by a media service apparatus using a communication protocol that supports web services; a data separation unit that separates the received media data into first media type data and second media type data; a container unit that packages the frames constituting the first media data in a unit number and converts them into chunk data; a media restoration unit that decodes the converted chunk data with a decoder embedded in the web browser to restore first media and, when outputting the restored first media, provides time information on the point at which the restored first media is output, in units of the chunk data; and a sync unit that outputs the second media data in synchronization with the restored first media based on the time information provided in units of the chunk data.

Other specific details of the present invention are included in the detailed description and the drawings.

Embodiments of the present invention provide at least the following effects.

With the above solution, synchronized media can be played by restoring video and audio in accordance with the container specification.

In addition, with the above solution, synchronization is not a problem even when video and audio are decoded by different decoders, which provides an environment in which video and audio can be processed separately.

In addition, audio in codec formats that the video tag does not support can be used without synchronization problems, which reduces the dependency on codec formats.

The effects of the present invention are not limited to those exemplified above, and further effects are included in this specification.

FIG. 1 shows an embodiment of an overall system for media playback.
FIG. 2 is a diagram illustrating the four-layer TCP/IP model hierarchically defined for communication between devices.
FIG. 3 shows the process of establishing a WebSocket connection between the media service apparatus 110 and the media playback apparatus 120.
FIG. 4 shows an example of a process of transmitting and receiving data over a WebSocket connection.
FIG. 5 is a diagram illustrating the structure of a communication packet transmitted through the network interface 21.
FIG. 6 shows an embodiment of the media service apparatus 110.
FIG. 7 shows another embodiment of the media service apparatus 110.
FIG. 8 shows an embodiment of the script module.
FIG. 9 shows another embodiment of the script module.
FIG. 10 shows yet another embodiment of the script module.
FIG. 11 shows yet another embodiment of the script module.
FIG. 12 shows an embodiment of the media playback apparatus 120.
FIG. 13 shows another embodiment of the media playback apparatus 120.
FIG. 14 shows yet another embodiment of the media playback apparatus 120.
FIG. 15 shows a process of generating chunk data.
FIG. 16 shows a process of generating audio chunks.
FIG. 17 is an exemplary diagram for describing a process of generating a script module implemented in JavaScript according to an embodiment of the present invention.
FIG. 18 is an exemplary diagram of a computing device 400 for implementing the media playback apparatus 120.

Advantages and features of the present invention, and methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. The present invention is not limited to the embodiments disclosed below, however, and may be implemented in various different forms. The embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the invention pertains are fully informed of the scope of the invention, which is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

Unless otherwise defined, all terms (including technical and scientific terms) used in this specification have the meaning commonly understood by those of ordinary skill in the art to which the present invention pertains. Terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless expressly and specifically defined.

The terminology used in this specification is for describing the embodiments and is not intended to limit the present invention. In this specification, the singular form also includes the plural unless the phrase specifically states otherwise. As used in the specification, "comprises" and/or "comprising" does not exclude the presence or addition of one or more components other than those mentioned.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 shows an embodiment of an overall system for media playback. The overall system of FIG. 1 consists of a media service apparatus 110, a media playback apparatus 120, and a network 430 connecting the two.

The media service apparatus 110 includes a computing or processing device suitable for providing computing services to one or more video playback apparatuses. For example, the media service apparatus 110 includes a device that can generate or store a video stream and transmit it to user devices, such as a network camera, a network video recorder (NVR), or a digital video recorder (DVR).

The media playback apparatus 120 includes a computing or processing device suitable for interacting with the media service apparatus 110 or other computing user devices over a network. For example, the media playback apparatus 120 may include a desktop computer, a mobile phone or smartphone, a personal digital assistant (PDA), a laptop computer, and a tablet computer.

Media captured in real time or stored by the media service apparatus 110 is transmitted over the network 430 at the request of the media playback apparatus 120. The user can play the transmitted media through a user interface implemented on the web browser 210 of the media playback apparatus 120. Here, the web browser 210 covers not only commonly known browsers installed on desktop computers or mobile phones, such as Google Chrome, Microsoft Internet Explorer, Mozilla Firefox, and Apple Safari, but also software applications built separately using the resources or APIs of a web browser.

The RTSP/RTP protocol transmitted over WebSocket, which is the network communication scheme between the media service apparatus 110 and the media playback apparatus 120, is described below with reference to FIGS. 2 to 5.

FIG. 2 is a diagram illustrating the four-layer TCP/IP model hierarchically defined for communication between devices.

The four layers are the network interface layer 21, the Internet layer 22, the transport layer 23, and the application layer 24. In the RTSP/RTP protocol transmitted over WebSocket, the WebSocket connection sits on top of the transport layer 23 connection, so a TCP transport connection must first be established between the media service apparatus 110 and the media playback apparatus 120 before the WebSocket connection can be used. Once the WebSocket connection is established between the media service apparatus 110 and the media playback apparatus 120, for example through a three-way handshake process, WebSocket communication is carried out by transmitting WebSocket packets. The WebSocket connection and the WebSocket packet are described in detail below with reference to FIGS. 3 to 5.

FIG. 3 shows the process of establishing a WebSocket connection between the media service apparatus 110 and the media playback apparatus 120.

The media playback apparatus 120 requests the media service apparatus 110 to initiate a WebSocket connection using a WebSocket URI. The WebSocket URI can be obtained with the GetServiceCapabilities command. The WebSocket URI is expressed, for example, as "ws://192.168.0.5/webSocketServer" (S1000).
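As an illustration of what such a URI carries, the sketch below splits the example URI into the host, port, and path that a client would need to open the connection. The helper name is hypothetical; the default ports (80 for ws, 443 for wss) are those defined for the WebSocket URI schemes.

```python
from typing import Tuple
from urllib.parse import urlsplit


def parse_websocket_uri(uri: str) -> Tuple[str, int, str]:
    """Split a ws:// or wss:// URI into (host, port, path)."""
    parts = urlsplit(uri)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError(f"not a WebSocket URI: {uri}")
    if parts.hostname is None:
        raise ValueError(f"missing host in URI: {uri}")
    # Fall back to the scheme's default port when none is given.
    default_port = 443 if parts.scheme == "wss" else 80
    return parts.hostname, parts.port or default_port, parts.path or "/"


print(parse_websocket_uri("ws://192.168.0.5/webSocketServer"))
# ('192.168.0.5', 80, '/webSocketServer')
```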

The media playback apparatus 120 sends a WebSocket upgrade request to the media service apparatus 110. The media service apparatus 110 responds with status code 101, which indicates that it accepts the protocol switch request (S1100).

After the WebSocket connection is established between the media service apparatus 110 and the media playback apparatus 120, data is exchanged using the RTSP/RTP protocol transmitted over WebSocket rather than the HTTP/1.1 protocol. DESCRIBE, SETUP, PLAY, PAUSE, and TEARDOWN in FIG. 3 are RTSP commands. A DESCRIBE request includes a URL, and the response to DESCRIBE includes a description of the requested resource. A SETUP request specifies how a single media stream is to be transported. A PLAY request starts playback of one or all media streams, and multiple PLAY requests may be issued. A PAUSE request pauses one or all media streams; playback can be resumed with a PLAY request. A TEARDOWN request terminates the session: it stops playback of all media streams and releases all sessions associated with the related data (S1200).
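To make the command sequence concrete, the sketch below formats RTSP/1.0 request messages as plain text: a request line followed by a CSeq header and, once a session exists, a Session header. The helper name, the stream URL, and the session identifier are hypothetical and chosen only for illustration; a real client would add further headers such as Transport for SETUP.

```python
from typing import Optional

# The RTSP commands named in the text.
RTSP_METHODS = {"DESCRIBE", "SETUP", "PLAY", "PAUSE", "TEARDOWN"}


def build_rtsp_request(method: str, url: str, cseq: int,
                       session: Optional[str] = None) -> str:
    """Format a minimal RTSP/1.0 request (illustrative, not a full client)."""
    if method not in RTSP_METHODS:
        raise ValueError(f"unsupported RTSP method: {method}")
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    if session is not None:
        lines.append(f"Session: {session}")
    # Headers end with an empty line, as in HTTP.
    return "\r\n".join(lines) + "\r\n\r\n"


url = "rtsp://192.168.0.5/stream"  # hypothetical stream URL
print(build_rtsp_request("DESCRIBE", url, 1))
print(build_rtsp_request("PLAY", url, 3, session="12345678"))
```

Each such message would then be sent as the payload of a WebSocket message instead of directly over a TCP socket.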

Table 1 below gives examples of the request message sent by the media playback apparatus 120 and the response message of the media service apparatus 110 in the WebSocket connection process shown in FIG. 3.

Media playback apparatus 120 -> media service apparatus 110
GET /webSocketServer HTTP/1.1
Host: 192.168.0.1
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: rtsp.onvif.org
Sec-WebSocket-Version: 13

Media service apparatus 110 -> media playback apparatus 120
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: rtsp.onvif.org
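The Sec-WebSocket-Accept value in the response is derived from the Sec-WebSocket-Key in the request. Under the WebSocket protocol (RFC 6455), the server appends a fixed GUID to the key, hashes the result with SHA-1, and Base64-encodes the digest. The sketch below reproduces the value in the exchange above; the function name is hypothetical.

```python
import base64
import hashlib

# Fixed GUID defined by the WebSocket protocol (RFC 6455).
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a Sec-WebSocket-Key."""
    digest = hashlib.sha1((key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client can use the same computation to check that the 101 response really answers its own upgrade request.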

This WebSocket connection is made in accordance with the WebSocket protocol, an HTML5 standard. In particular, because a WebSocket connection supports persistent bidirectional communication, the media service apparatus 110 and the media playback apparatus 120 can keep transmitting and receiving data without the connection being dropped.

FIG. 4 shows an example of a process of transmitting and receiving data over a WebSocket connection.

Referring to FIG. 4, the media playback apparatus 120 first transmits a TCP/IP connection request message to the media service apparatus 110, and when the media service apparatus 110 accepts it and transmits a TCP response message (SYN-ACK) to the media playback apparatus 120, a TCP/IP connection is established. A TCP transport connection can be formed by a pair of a local TCP socket and a remote TCP socket, each of which is defined by at least identifiers such as a port number and an IP address. A UDP/IP-based connection may of course be established between the two instead of this TCP/IP-based connection.

Once the WebSocket connection is then established between the media playback apparatus 120 and the media service apparatus 110 through a handshake process, the two can exchange data continuously. That is, the media playback apparatus 120 transmits a media streaming request to the media service apparatus 110 in the form of a transmission WebSocket packet (socket.send), and the media service apparatus 110 transmits the media stream to the media playback apparatus 120 in the form of a response WebSocket packet (socket.onMessage). This exchange can continue between the two until the media stream transmission is stopped or completed.

FIG. 5 is a diagram illustrating the structure of a communication packet transmitted through the network interface 21.

When the RTP header 44 is added to the RTP payload corresponding to the data 45, the result is an RTP packet. The RTP packet serves as the WebSocket payload, and adding the WebSocket header 43 to it yields a WebSocket packet. The WebSocket packet serves as the TCP payload, and adding the TCP header 42 to it yields a TCP packet. Finally, the TCP packet serves as the IP payload, and adding the IP header 41 to it completes the communication packet, that is, the IP packet. This process of building the IP packet and the reverse process of removing each header are performed in both the media service apparatus 110 and the media playback apparatus 120.
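The nesting order of the headers can be sketched as follows. The byte strings used as headers are placeholders chosen for readability, not real RTP, WebSocket, TCP, or IP header formats, and the function names are hypothetical.

```python
# Innermost layer first; each header is a readable placeholder, not a real format.
LAYERS = [
    ("RTP", b"[RTP]"),
    ("WebSocket", b"[WS]"),
    ("TCP", b"[TCP]"),
    ("IP", b"[IP]"),
]


def encapsulate(data: bytes) -> bytes:
    """Wrap the data in each layer's header, from RTP outward to IP."""
    packet = data
    for _name, header in LAYERS:
        packet = header + packet
    return packet


def decapsulate(packet: bytes) -> bytes:
    """Strip the headers in the opposite order, from IP inward to RTP."""
    for name, header in reversed(LAYERS):
        if not packet.startswith(header):
            raise ValueError(f"missing {name} header")
        packet = packet[len(header):]
    return packet


wire = encapsulate(b"frame")
print(wire)               # b'[IP][TCP][WS][RTP]frame'
print(decapsulate(wire))  # b'frame'
```

The sender runs the first function and the receiver the second, which mirrors the statement that both apparatuses build and strip the headers.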

Because communication between the media service apparatus 110 and the media playback apparatus 120 takes place over the HTML5-based WebSocket protocol described above with reference to FIGS. 2 to 5, the module responsible for RTSP/RTP transmission and reception control can be implemented in script code that can be parsed in HTML5. For example, it can be implemented in JavaScript, a script that can be parsed in HTML5. Media playback using the RTSP/RTP protocol can therefore be implemented within a web browser in an HTML5 environment without separately installing a plug-in as in the related art.

The network communication scheme between the media service apparatus 110 and the media playback apparatus 120 has been described so far. The configuration and operation of the media service apparatus 110 and the media playback apparatus 120 are described below with reference to FIGS. 6 to 18.

FIG. 6 shows an embodiment of the media service apparatus 110. In this embodiment, the media service apparatus 110 includes a real-time video camera 111, an encoder 112, a packetizing unit 113, a web server 114, a module storage unit 115, a module transmission unit 116, and a control unit 117.

The real-time video camera 111 is a means for capturing media in real time; capture may involve both video capture and audio recording, or video capture only.

The encoder 112 compresses and encodes the media captured by the real-time video camera 111. Encoding by the encoder 112 need not use a specific codec format supported by the decoder embedded in the web browser; the media may be encoded in any codec format.

The packetizer 113 packetizes the encoded media data to generate transport packets. Packetization means dividing the media data into pieces of suitable length so that it can be transmitted easily over the network 430, or, when the media data is short, bundling it to a suitable length, and assigning to each piece control information such as the address at which it is to be received. The control information is placed in the header of the transport packet. The transport packet takes the form of the websocket packet described above.
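The splitting step of packetization can be illustrated with a short sketch. The function name `packetize`, the `maxPayload` parameter, and the header fields `seq` and `last` are assumptions made for illustration; the source does not specify the control information beyond "an address and the like".

```javascript
// Split encoded media data into payloads of at most maxPayload bytes and
// attach a small header (control information) to each one.
// The header fields (seq, last) are hypothetical, chosen only for illustration.
function packetize(mediaData, maxPayload) {
  if (maxPayload <= 0) throw new RangeError("maxPayload must be positive");
  const packets = [];
  let seq = 0;
  for (let offset = 0; offset < mediaData.length; offset += maxPayload) {
    const end = Math.min(offset + maxPayload, mediaData.length);
    packets.push({
      header: { seq: seq, last: end === mediaData.length },
      payload: mediaData.slice(offset, end),
    });
    seq += 1;
  }
  return packets;
}
```

For ten bytes of media data and a `maxPayload` of 4, this yields three packets carrying 4, 4, and 2 bytes, with only the final packet marked `last`.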

The packetizer 113 may packetize the media data in the manner requested by the media playback apparatus 120. For example, when the media playback apparatus 120 requests video in units of frames, the packetizer 113 generates transport packets in a frame format; when the media playback apparatus 120 requests video in units of a container supported by the decoder embedded in the web browser, the packetizer 113 generates transport packets in a container format.

The web server 114 establishes a communication session with the media playback apparatus 120. That is, a websocket connection is established between the two through a handshake between the web server 114 of the media service apparatus 110 and the media playback apparatus 120. Thereafter, at the request of the media playback apparatus 120, the transport packets generated by the packetizer 113 are transmitted through the web server 114.

The module storage unit 115 stores the script module needed to play media on the media playback apparatus 120. The script module is code written in a script that a web browser can parse, and it enables media playback in a web browser in an HTML5 environment without installing a plug-in or a separate application on the media playback apparatus 120. The script module may be, for example, code written in JavaScript. The script module is described later with reference to FIGS. 8 to 11.

The module transmission unit 116 transmits the script module stored in the module storage unit 115 to the media playback apparatus 120. It transmits the script module in response to the media playback apparatus 120 connecting to the media service apparatus 110 through the network 430.

The controller 117 controls the other constituent modules of the media service apparatus 110. For example, when the media playback apparatus 120 connects to the web server 114 through the network 430, the script module stored in the module storage unit 115 is transmitted to the media playback apparatus 120 through the module transmission unit 116; the controller 117 exchanges signals with each module and coordinates them so that this operation proceeds smoothly.

Based on the above description of the constituent modules of the media service apparatus 110 of FIG. 6, its operation is as follows. When the media playback apparatus 120 connects to the web server 114 through the network 430, the module transmission unit 116 transmits the script module stored in the module storage unit 115 to the media playback apparatus 120. Once the script module is installed on the media playback apparatus 120, the user requests media playback through the user interface. In response, the media service apparatus 110 encodes the real-time live media captured by the real-time video camera 111 in the encoder 112, packetizes the media data into transport packets in the frame or container format in the packetizer 113, and transmits them to the media playback apparatus 120 through the web server 114.

FIG. 7 illustrates another embodiment of the media service apparatus 110. The media service apparatus 110 of FIG. 6 is an embodiment for transmitting real-time live media using the real-time video camera 111, whereas the media service apparatus 110 of FIG. 7 is an embodiment for transmitting media data stored in a media storage unit 118.

The media storage unit 118 may be a network video recorder (NVR) or a personal video recorder (PVR); the embodiment of FIG. 7 is described taking a network video recorder as an example.

The media storage unit 118 receives media data from a camera or a server and stores it in compressed form. When the media playback apparatus 120 requests transmission of media data, the media service apparatus 110 packetizes the media data stored in the media storage unit 118 in the packetizer 113 and transmits it through the web server 114.

Among the constituent modules of the media service apparatus 110 of FIG. 7, the packetizer 113, the web server 114, the module storage unit 115, the module transmission unit 116, and the controller 117 are the modules described in the embodiment of FIG. 6, and redundant description is omitted.

FIGS. 8 to 11 illustrate embodiments of the script module transmitted from the media service apparatus 110 to the media playback apparatus 120. The script module may be implemented in a script that a web browser can parse; the embodiments of FIGS. 8 to 11 are described for the case where it is implemented in JavaScript.

FIG. 8 illustrates an embodiment of the script module. The script module of FIG. 8 is a module for restoring video frame by frame and includes an RTSP/RTP client module 121, a depacketization module 122, a JS decoder module 124, and a JS renderer module 125.

The RTSP/RTP client module 121 supports RTSP/RTP communication with the media service apparatus 110. Transport packets can be received from the web server 114 of the media service apparatus 110 through the RTSP/RTP client module 121. At present, media cannot be processed according to RTSP/RTP in a web browser without a plug-in; with the RTSP/RTP client module 121, however, data transmitted using the RTSP/RTP protocol can be received reliably even though the web browser uses HTTP.

The depacketization module 122 depacketizes the packets received from the RTSP/RTP client module 121. Depacketization is the inverse of packetization: where packetization divides media data into pieces of suitable length to form packets, depacketization joins those pieces back together to restore the media data to its state before packetization.
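A minimal sketch of this reassembly is shown below. The packet shape `{ header: { seq }, payload }` and the function name `depacketize` are hypothetical; sorting by sequence number is a defensive choice made here, not something the source describes.

```javascript
// Reassemble media data from packets shaped like { header: { seq }, payload }.
// The packet shape is hypothetical. Packets are sorted by sequence number
// first, so out-of-order input is tolerated.
function depacketize(packets) {
  const ordered = [...packets].sort((a, b) => a.header.seq - b.header.seq);
  const total = ordered.reduce((sum, p) => sum + p.payload.length, 0);
  const mediaData = new Uint8Array(total);
  let offset = 0;
  for (const p of ordered) {
    mediaData.set(p.payload, offset); // copy each payload back into place
    offset += p.payload.length;
  }
  return mediaData;
}
```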

The JS decoder module 124 decompresses and decodes the encoded video. Like the other modules of the script module, it can be implemented in JavaScript. Because the JS decoder module 124 is implemented in JavaScript, it can, unlike the decoder embedded in the web browser, decode any codec format without restriction. It can also decode frame by frame.

When the JS decoder module 124 is implemented in JavaScript according to the embodiment of FIG. 8, it may be expressed, for example, by code such as that in Tables 2 and 3 below (Table 3 continues from Table 2).

if (!(itemId in self._requestContext.dependencies)) {
if (width > payload.displayWidth) {
payload.displayWidth = width;
}
payload.frames.push({
canvasFrameData: displayImageData.data,
itemId: itemId,
width: width,
height: height
});
}
self._requestContext.currentFrameIndex++;
if (self._requestContext.currentFrameIndex >= self._requestContext.itemIds.length) {
self._requestContext.callback(payload);
self._isRequestActive = false;
self.decode(); // Decode next queued request
}
});
};

...

this._createDecodeCanvas(document.documentElement);
this._reset();
}

The JS renderer module 125 renders the video decoded by the JS decoder module 124 so that it is displayed on an output device such as a monitor. The JS renderer module 125 uses WebGL and can convert video in YUV format to RGB format. WebGL is a web-based graphics library that is usable from JavaScript and allows 3D graphics interfaces to be created.
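The per-pixel arithmetic behind a YUV-to-RGB conversion can be sketched in plain JavaScript. This is an assumption-laden illustration: the source performs the conversion with WebGL and does not state which colour matrix it uses, so the full-range BT.601 (JFIF) coefficients below, and the names `clampByte` and `yuvToRgb`, are choices made here. In a WebGL renderer the same arithmetic would typically run per pixel in a fragment shader.

```javascript
// Clamp a value to the 0..255 range of an 8-bit colour channel.
function clampByte(x) {
  return Math.max(0, Math.min(255, Math.round(x)));
}

// Convert one full-range YUV (YCbCr) pixel to RGB using BT.601 (JFIF)
// coefficients. The choice of matrix is an assumption, not from the source.
function yuvToRgb(y, u, v) {
  const d = u - 128; // blue-difference chroma, centred on zero
  const e = v - 128; // red-difference chroma, centred on zero
  return [
    clampByte(y + 1.402 * e),
    clampByte(y - 0.344136 * d - 0.714136 * e),
    clampByte(y + 1.772 * d),
  ];
}
```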

FIG. 9 illustrates another embodiment of the script module. The script module of FIG. 9 is a module used to restore video in a container format and includes an RTSP/RTP client module 121, a depacketization module 122, and a container generation module 127. The RTSP/RTP client module 121 and the depacketization module 122 are the modules described with reference to FIG. 8, and redundant description is omitted.

Unlike the script module of FIG. 8, the script module of FIG. 9 includes the container generation module 127. When the video is not packaged in container units, the container generation module 127 can package frames by a unit number to form a container. It can also package video frames together with audio by the unit number. The container generation module 127 can variably adjust the unit number according to the fps (frames per second) of the video. The result of packaging video frames by the unit number, or video frames and audio by the unit number, in the container generation module 127 is referred to as chunk data.
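The grouping of frames by a unit number can be sketched as follows. The source says only that the unit number varies with fps and gives no rule, so the policy in `unitCountFor` (aim for chunks of roughly 0.1 second) is purely an assumption for illustration, as are both function names. Real chunk data would also carry a container header, which is omitted here.

```javascript
// Hypothetical policy: choose a unit number so that each chunk spans roughly
// targetSeconds of video. The source does not specify how the number is chosen.
function unitCountFor(fps, targetSeconds = 0.1) {
  return Math.max(1, Math.round(fps * targetSeconds));
}

// Group consecutive frames into chunks of unitCount frames each.
// The final chunk may be shorter if the frames do not divide evenly.
function packageFrames(frames, unitCount) {
  if (unitCount < 1) throw new RangeError("unitCount must be at least 1");
  const chunks = [];
  for (let i = 0; i < frames.length; i += unitCount) {
    chunks.push(frames.slice(i, i + unitCount));
  }
  return chunks;
}
```

With this assumed policy, 30 fps video gives a unit number of 3, so seven frames are grouped into chunks of 3, 3, and 1 frames.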

The container is a concept that includes containers supported by the video tag, such as an MPEG-DASH container. Because the container generation module 127 can arrange media in a container format compatible with the HTML5 video tag, the video tag can be used on the media playback apparatus 120 without compatibility problems even when the image capturing device does not transmit the media in a container format. In other words, it provides an environment in which the video tag can be used without modifying already installed image capturing devices.

FIG. 10 illustrates yet another embodiment of the script module. The script module of FIG. 10 is a module for audio playback and includes an RTSP/RTP client module 121, a depacketization module 122, and an audio transcoder 123. The RTSP/RTP client module 121 and the depacketization module 122 are the modules described with reference to FIG. 8, and redundant description is omitted.

The audio transcoder 123 performs transcoding when the audio data is in a codec format not supported by the decoder embedded in the web browser. Transcoding is the conversion of a codec format: the audio transcoder 123 decompresses the audio data and compresses it again in a different codec format. An example is converting audio data in the G.711 codec format, which the video tag (the decoder embedded in the web browser) does not support, into the AAC codec format, which the video tag does support.
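The first half of such a transcode, decompressing G.711 audio to linear PCM, can be sketched as below. G.711 defines two companding laws, mu-law and A-law; this sketch assumes mu-law, which the source does not specify. The names `ulawToPcm16` and `decodeUlaw` are hypothetical, and the second half of the transcode (re-encoding the PCM samples as AAC) is omitted because it requires a full encoder.

```javascript
const ULAW_BIAS = 0x84; // bias added before compression in G.711 mu-law

// Expand one 8-bit G.711 mu-law code word to a signed linear PCM sample.
function ulawToPcm16(byte) {
  const u = ~byte & 0xff;            // mu-law code words are stored inverted
  const sign = u & 0x80;             // top bit: set for negative samples
  const exponent = (u >> 4) & 0x07;  // 3-bit segment number
  const mantissa = u & 0x0f;         // 4-bit step within the segment
  const magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS;
  return sign ? -magnitude : magnitude;
}

// Decode a buffer of mu-law bytes into 16-bit PCM samples.
function decodeUlaw(bytes) {
  const pcm = new Int16Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) {
    pcm[i] = ulawToPcm16(bytes[i]);
  }
  return pcm;
}
```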

Because the audio transcoder 123 converts the audio data into a codec format supported by the decoder embedded in the web browser, it provides an environment in which the audio data can be restored together with the video data by that embedded decoder and output without synchronization problems.

FIG. 11 illustrates yet another embodiment of the script module. Like the script module of FIG. 10, the script module of FIG. 11 is a module for audio playback, but its configuration differs. The script module of FIG. 11 includes an RTSP/RTP client module 121, a depacketization module 122, an audio decoder 126, an audio chunk unit 128, and a buffer controller 129; the RTSP/RTP client module 121 and the depacketization module 122 are the modules described with reference to FIG. 8.

The audio decoder 126 decodes audio data. Like the other modules, it can be implemented in JavaScript, a script that a web browser can parse.

The audio data decoded by the audio decoder 126 is packaged into audio chunks in the audio chunk unit 128. Each audio chunk is generated by packaging the audio data to match the unit number of frames that make up the chunk data generated by the container generation module 127. In other words, the audio chunks are generated in dependence on the chunk data. FIG. 12 shows a media playback apparatus 120 that includes both the container generation module 127 and the audio chunk unit 128, and the generation of audio chunks is described again with reference to FIG. 12.

The buffer controller 129 receives audio chunks from the audio chunk unit 128, buffers them in an audio buffer, and provides the buffered audio chunks to an audio renderer 136. The audio renderer 136 is a constituent module of the media playback apparatus 120 and is shown in FIGS. 12 and 13.

The buffer controller 129 receives time information from another module that restores the video and synchronizes the audio to the video. The time information concerns the point in time at which the chunk data is decoded and output by a media restoration unit 143 installed in the media playback apparatus 120; it indicates the time at which the start of the chunk data is rendered by a renderer in the media playback apparatus 120. Using this time information, the buffer controller 129 synchronizes the video and the audio by buffering each audio chunk, or passing it to the audio renderer 136, so that the audio is output at the time the video is displayed.
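The buffering behaviour described above can be sketched as a small class. Everything here is an assumption for illustration: the class shape, the method names `pushAudioChunk` and `onVideoTime`, and the use of a chunk index to pair audio chunks with chunk data are not taken from the source, which states only that audio is held or released according to time information about when the video is rendered.

```javascript
// Hypothetical buffer controller: audio chunks are queued under the index of
// the video chunk they belong to, and released only once time information
// reports that the matching video chunk has started rendering.
class BufferController {
  constructor(renderAudio) {
    this.renderAudio = renderAudio; // callback standing in for the audio renderer
    this.pending = new Map();       // chunkIndex -> buffered audio chunk
    this.started = new Set();       // chunk indices whose video has started
  }

  // Called when an audio chunk is ready.
  pushAudioChunk(chunkIndex, audioChunk) {
    if (this.started.has(chunkIndex)) {
      this.renderAudio(chunkIndex, audioChunk); // video already showing: play now
    } else {
      this.pending.set(chunkIndex, audioChunk); // otherwise hold it in the buffer
    }
  }

  // Called with time information: the video chunk at chunkIndex began rendering.
  onVideoTime(chunkIndex) {
    this.started.add(chunkIndex);
    if (this.pending.has(chunkIndex)) {
      this.renderAudio(chunkIndex, this.pending.get(chunkIndex));
      this.pending.delete(chunkIndex);
    }
  }
}
```

In this sketch an audio chunk that arrives before its video is held until `onVideoTime` fires, while one that arrives afterwards is passed straight to the renderer callback.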

The modules of the script module of FIG. 11 can also be arranged in a different order, with the audio decoder 126 moved after the buffer controller 129. In that case, the buffer controller 129 first synchronizes the audio data with the chunk data, and the audio decoder 126 then decodes the buffered audio data.

The script modules described above with reference to FIGS. 8 to 11 are transmitted from the media service apparatus 110 to the media playback apparatus 120 when the media playback apparatus 120 connects to the media service apparatus 110, and provide an environment in which media can be played in the web browser 210 of the media playback apparatus 120 without a plug-in. That is, the script module is installed on the media playback apparatus 120 and forms a system for media playback. Embodiments of the media playback apparatus 120 with the script module installed are described below with reference to FIGS. 12 to 14.

FIGS. 12 to 14 illustrate embodiments of the media playback apparatus 120. The main modules for media playback in the media playback apparatus 120 are constituted by the script module. Since the functions of the script module have been described with reference to FIGS. 8 to 11, the description of FIGS. 12 to 14 focuses on the structure and operation of the media playback apparatus 120.

FIG. 12 illustrates an embodiment of the media playback apparatus 120. The media playback apparatus 120 of FIG. 12 includes a receiving unit 141, a data separation unit, a container unit 142, a media restoration unit 143, and an audio sync unit 144. The receiving unit 141, the data separation unit, the container unit 142, and the audio sync unit 144 can be implemented in JavaScript. The media playback apparatus 120 of FIG. 12 can be configured by receiving the script modules of FIGS. 9 and 11.

The receiving unit 141 receives the media data generated by the media service apparatus 110 using a communication protocol that supports web services. This protocol may be the RTSP/RTP protocol carried over websocket. The receiving unit 141 includes a websocket client 131 and the RTSP/RTP client module 121.

The websocket client 131 establishes a websocket connection with the web server 114 of the media service apparatus 110. The media playback apparatus 120 and the media service apparatus 110 exchange transport packets after a handshake between the websocket client 131 and the web server 114.

As described in the embodiment of FIG. 8, the RTSP/RTP client module 121 enables RTSP/RTP communication in the user's web browser 210. The user can therefore play media through the web browser 210 in an HTML5 environment using the RTSP/RTP protocol without installing a separate plug-in.

The media data that has passed through the receiving unit 141 is separated into video data and audio data by the data separation unit. From the RTSP/RTP client module 121, the video data follows the lower-left arrow to the depacketization module 122a and the audio data follows the lower-right arrow to the depacketization module 122b. The depacketization modules 122a and 122b depacketize the video data and the audio data, respectively; the depacketized video data is passed to the container unit 142 and the depacketized audio data to the audio sync unit 144.

The container unit 142 includes the container generation module 127. When the video data is not in a container format, the container generation module 127 packages the video frames by the unit number and converts them into chunk data.

The process of generating chunk data in the container generation module 127 is described with reference to FIG. 15. The video frames 311 of FIG. 15 represent video data organized in units of frames, obtained by depacketization in the depacketization module 122a. The video frames 311 are data compressed with any of various video codecs such as MPEG-2, H.264, and H.265.

When the depacketization module 122a passes the video frames 311 to the container generation module 127, the container generation module 127 uses header information to determine whether the received data is in a container format. Because the video frames 311 are data in a frame format, the container generation module 127 converts them into a container format so that they can be processed by the MSE 134 and the video tag 135.

The conversion from the frame format to the container format is performed by packaging a plurality of frames by the unit number. The unit number can be selected variably according to the fps (frames per second) of the video data. FIG. 15 shows the generation of chunk data when the unit number is 3. The container generation module 127 parses the frame header information of each of the video frames 311 (frame size, prediction mode (intra or inter), frame type (I, B, or P), and so on) to generate a container header, and packages the data of the video frames 311 together with the container header to convert them into chunk data 312 in a container format. The chunk data 312 thus converted can be processed without compatibility problems by the media restoration unit 143, which includes the MSE 134 and the video tag 135.

Returning to FIG. 12, a dotted arrow is drawn from the container generation module 127 to the audio chunk unit 128, indicating that information for packaging the audio data is passed along it. This information is described in detail below in connection with the audio chunk unit 128.

The media restoration unit 143 decodes the chunk data with the decoder embedded in the web browser to restore the video, and outputs the restored video. The decoder embedded in the web browser may be the video tag. In the embodiment of FIG. 12, the media restoration unit 143 includes the MSE 134 and the video tag 135.

The MSE 134 is a JavaScript API for HTML5 created for streaming video playback using HTTP download. Standardized by the W3C, this technology enables streaming playback on game consoles such as the Xbox and PlayStation 4 (PS4), in Chromecast browsers, and the like.

The video tag 135 performs decoding and rendering so that the video is displayed in the web browser. Using the decoder of the video tag 135 allows decoding with better performance than the JS decoder module 124, which is limited by the dynamic-language nature of JavaScript. That is, high-resolution video and high frame rates can be decoded.

To summarize the video output process performed by the modules described above: the video data is separated by the data separation unit and passed to the container unit 142, and when the video data is not in container units, the container unit 142 packages the frames by the unit number and converts them into chunk data. The video data in the form of chunk data is then decoded and rendered in the media restoration unit 143 and output. The output process for the audio data is described next.

데이터 분리부에서 분리된 오디오 데이터는 디패킷화 모듈(122b)에서 디패킷화된 뒤 오디오 싱크부(144)를 통해 비디오 데이터와 동기화되어 출력된다. 오디오 싱크부(144)는 오디오 디코더(126), 오디오 청크부(128), 버퍼 컨트롤러(129), 및 오디오 렌더러(136)를 포함할 수 있다.The audio data separated by the data separator is depacketized by the depacketization module 122b and then output in synchronization with the video data through the audio sinker 144. The audio sink 144 may include an audio decoder 126, an audio chunk 128, a buffer controller 129, and an audio renderer 136.

오디오 디코더(126)는 분리된 오디오 데이터를 웹 브라우저에서 파싱할 수 있는 스크립트에 의해 디코드한다. 이때 웹 브라우저에서 파싱할 수 있는 스크립트는 자바스크립트일 수 있다. The audio decoder 126 decodes the separated audio data by a script capable of parsing the web browser. In this case, the script that can be parsed by the web browser may be JavaScript.

오디오 청크부(128)는 오디오 디코더(126)에서 디코드 된 오디오 데이터를 패키징하여 오디오 청크를 생성한다. 오디오 청크부(128)에서는 컨테이너 생성 모듈(127)에서 생성되는 청크 데이터와 대응되는 범위로 오디오 데이터를 패키징하여야 하므로, 오디오 청크부(128)는 컨테이너 생성 모듈(127)로부터 청크 데이터에 관한 정보를 제공 받는다. 컨테이너 생성 모듈(127)에서 오디오 청크부(128)로 연결된 점선으로 된 화살표는 상기 정보가 전달되는 것을 의미한다.The audio chunk unit 128 generates audio chunks by packaging audio data decoded by the audio decoder 126. Since the audio chunk unit 128 should package audio data in a range corresponding to the chunk data generated by the container generation module 127, the audio chunk unit 128 receives information about the chunk data from the container generation module 127. Get provided. A dotted arrow connected to the audio chunk unit 128 in the container generation module 127 means that the information is transmitted.

Refer to FIG. 16 for the process of generating an audio chunk. The audio data 321 of FIG. 16 is the audio data after it has been decoded and decompressed by the audio decoder 126 and before it is rendered by the audio renderer 136. It is an audio signal expressed as digital values and may be, for example, an ordinary wave (wav) file. Because such an audio signal is by nature defined as amplitude over time, cutting out an arbitrary interval extracts the audio signal for the corresponding range. For example, when the frame rate of the video data is 30 fps and the unit number of frames constituting the chunk data in the container generation module 127 is 3, the container generation module 127 passes the frame rate of 30 fps and the unit number 3 to the audio chunk unit 128. Since one video frame corresponds to 1/30 second and the unit number is 3, the audio chunk unit 128 packages the audio data 321 in intervals of 1/10 second, the time corresponding to three video frames, to generate the audio chunks 322.
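The packaging arithmetic above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the use of mono Float32Array PCM samples, and the 48 kHz sample rate are assumptions made for the example.

```javascript
// Duration in seconds covered by one chunk of `unitCount` video frames.
function chunkDurationSeconds(fps, unitCount) {
  return unitCount / fps;
}

// Split decoded mono PCM samples into audio chunks that each span the same
// time range as one unit of chunk data (hypothetical helper).
function sliceIntoAudioChunks(samples, sampleRate, fps, unitCount) {
  if (!(fps > 0) || !(unitCount > 0) || !(sampleRate > 0)) {
    throw new RangeError("fps, unitCount and sampleRate must be positive");
  }
  const chunks = [];
  for (let i = 0; ; i++) {
    // Boundaries are derived from the chunk index so rounding does not accumulate.
    const start = Math.round((i * unitCount * sampleRate) / fps);
    if (start >= samples.length) break;
    const end = Math.min(
      Math.round(((i + 1) * unitCount * sampleRate) / fps),
      samples.length
    );
    chunks.push(samples.subarray(start, end));
  }
  return chunks;
}

// Example from the text: 30 fps and unit number 3 give 1/10 second per chunk.
const pcmOneSecond = new Float32Array(48000); // assumed 48 kHz sample rate
const exampleAudioChunks = sliceIntoAudioChunks(pcmOneSecond, 48000, 30, 3);
```

With these assumed inputs, one second of audio is cut into ten chunks, each covering the same 1/10 second as three video frames.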

Comparing the audio chunk, which is the audio data synchronized with the chunk data, against the chunk data generated by the container generation module 127: the chunk data is obtained by packaging video frames by the unit number, or video frames and audio by the unit number, whereas the audio chunk is obtained by packaging audio data to match that unit number. In addition, the chunk data is compressed data that has not yet been decoded by the decoder 135a, whereas the audio chunk is data that has been decoded and decompressed by the audio decoder 126.

Returning to FIG. 12, the buffer controller 129 receives the audio chunks from the audio chunk unit 128, buffers them in an audio buffer, and provides the buffered audio chunks to the audio renderer 136.

The buffer controller 129 is a module that uses buffering to make it possible to output the audio data in synchronization with the video data. Specifically, when the media restoring unit 143 passes to the buffer controller 129, in units of chunk data, time information about the time point at which the video is output at the renderer 135b, the buffer controller 129 holds the audio chunks in the buffer and, using the time information, provides each audio chunk to the audio renderer 136 at the time point at which the video is output. The dotted arrow from the renderer 135b to the buffer controller 129 in FIG. 12 indicates that the time information is passed from the media restoring unit 143 to the buffer controller 129. Here, the time information indicates the time point at which the chunk data corresponding to the time information is rendered by the renderer 135b, and the reference for that time point may be the time point at which the beginning of the corresponding chunk data is rendered by the renderer.
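The buffering behaviour described above can be sketched as a small class. This is only an illustration under stated assumptions: the class name, the use of a chunk index to match audio chunks to chunk data, and the `renderAudio` callback standing in for the audio renderer 136 are hypothetical and are not specified in the description.

```javascript
// Hypothetical sketch of a buffer controller: audio chunks wait until the time
// information for the matching unit of chunk data arrives.
class AudioBufferController {
  constructor(renderAudio) {
    this.renderAudio = renderAudio; // stands in for the audio renderer
    this.pending = new Map();       // chunkIndex -> buffered audio chunk
    this.earlyTimes = new Map();    // chunkIndex -> time info that arrived first
  }

  // Called for each audio chunk produced by the audio chunk unit.
  pushAudioChunk(chunkIndex, audioChunk) {
    if (this.earlyTimes.has(chunkIndex)) {
      // The video chunk was already reported; release the audio immediately.
      const renderTime = this.earlyTimes.get(chunkIndex);
      this.earlyTimes.delete(chunkIndex);
      this.renderAudio(audioChunk, renderTime);
    } else {
      this.pending.set(chunkIndex, audioChunk);
    }
  }

  // Called with the time information reported per unit of chunk data.
  onVideoTimeInfo(chunkIndex, renderTime) {
    if (this.pending.has(chunkIndex)) {
      const audioChunk = this.pending.get(chunkIndex);
      this.pending.delete(chunkIndex);
      this.renderAudio(audioChunk, renderTime);
    } else {
      this.earlyTimes.set(chunkIndex, renderTime);
    }
  }
}
```

Keying both maps by chunk index keeps the sketch correct whichever side arrives first. A real player would also need to bound the buffer and discard stale entries, which is omitted here.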

The audio renderer 136 performs rendering to output the audio chunks received from the buffer controller 129. The audio renderer 136 may be implemented with the Web Audio API supported by the web browser.
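One possible way to render a decoded audio chunk with the Web Audio API is sketched below. The function name and the mono Float32Array input are assumptions. The context is passed in as a parameter, and how the video render time is mapped onto the audio context clock (`when`) is left to the caller because the description does not specify it.

```javascript
// Schedule one decoded mono audio chunk on a Web Audio context.
// `audioCtx` is expected to be an AudioContext; `when` is a time on its clock.
function scheduleAudioChunk(audioCtx, samples, sampleRate, when) {
  // Create a one-channel buffer and copy the PCM samples into channel 0.
  const buffer = audioCtx.createBuffer(1, samples.length, sampleRate);
  buffer.copyToChannel(samples, 0);

  // A buffer source plays its buffer once, starting at the scheduled time.
  const source = audioCtx.createBufferSource();
  source.buffer = buffer;
  source.connect(audioCtx.destination);
  source.start(when);
  return source;
}

// In a browser this might be used as (not executed here):
//   const ctx = new AudioContext();
//   scheduleAudioChunk(ctx, chunkSamples, 48000, ctx.currentTime + 0.05);
```

Scheduling against the context clock rather than calling `start()` with no argument lets consecutive chunks be placed back to back without gaps.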

To summarize the audio output process through the modules described above: the audio data is separated by the data separator and passed to the audio decoder 126. The audio decoder 126 decodes and restores the audio data, and the audio chunk unit 128 packages the restored audio into audio chunks, matching the unit number of frames that constitute the chunk data. The audio data packaged into audio chunks is buffered by the buffer controller 129 and then output through the audio renderer 136 at the time point at which the video is output. That time point is carried in the time information passed from the media restoring unit 143 to the buffer controller 129.

Because the media playback apparatus 120 of FIG. 12 synchronizes the video data and the audio data without transcoding the audio data, there is no risk of the audio data being degraded by transcoding. In addition, with the media playback apparatus 120 of FIG. 12, synchronization is not a problem even when the video data and the audio data are decoded by different decoders, which provides an environment in which the two can be processed separately. That is, even when audio data in a codec format not supported by the video tag is restored by a separate decoder, synchronization is not a problem, so the dependency on the codec formats supported by the video tag is reduced.

FIG. 13 shows another embodiment of the media playback apparatus 120. Whereas in the embodiment of FIG. 12 the audio data is restored and then synchronized with the video data, in the embodiment of FIG. 13 the audio data is synchronized with the video data and then decoded and restored. Compared with the embodiment of FIG. 12, the audio decoder 126 in the embodiment of FIG. 13 is moved to the stage after the buffer controller 129. The rest of the configuration is the same as in FIG. 12, so duplicate description is omitted and only the changed audio sync unit 144 is described below.

According to the embodiment of FIG. 13, the audio chunk unit 128 generates audio chunks by packaging the audio data before it is decoded and restored. To package the audio data into ranges corresponding to the chunk data generated by the container generation module 127, the audio chunk unit 128 receives information about the chunk data from the container generation module 127.

The buffer controller 129 uses the time information received from the renderer 135b to synchronize the audio chunks with the video data and passes them to the audio decoder 126. In the embodiment of FIG. 13, the audio data synchronized with the chunk data is data that has not yet been decoded by the audio decoder 126, which distinguishes it from the embodiment of FIG. 12.

The audio decoder 126 restores the synchronized audio data and passes it to the audio renderer 136, and the audio renderer 136 performs rendering to output the restored audio data.
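The difference between the two orderings (decode, then buffer, as in FIG. 12; buffer, then decode, as in FIG. 13) can be illustrated with a single sketch. The factory function, its option names, and the chunk-index matching are hypothetical. `decode` and `render` are caller-supplied stand-ins for the audio decoder 126 and the audio renderer 136.

```javascript
// Hypothetical sync stage that supports both orderings.
// decodeBeforeBuffering = true  -> FIG. 12 style: decoded chunks are buffered.
// decodeBeforeBuffering = false -> FIG. 13 style: compressed chunks are buffered
//                                  and decoded only when they are released.
function createSyncStage({ decode, render, decodeBeforeBuffering }) {
  const buffered = new Map(); // chunkIndex -> audio chunk (decoded or compressed)
  return {
    // Buffer one audio chunk for the given unit of chunk data.
    push(chunkIndex, compressedChunk) {
      buffered.set(
        chunkIndex,
        decodeBeforeBuffering ? decode(compressedChunk) : compressedChunk
      );
    },
    // Release the chunk matching the reported time information, if present.
    release(chunkIndex, renderTime) {
      if (!buffered.has(chunkIndex)) return false;
      const chunk = buffered.get(chunkIndex);
      buffered.delete(chunkIndex);
      render(decodeBeforeBuffering ? chunk : decode(chunk), renderTime);
      return true;
    },
  };
}
```

In the FIG. 13 style, the buffer holds compressed data, which is typically smaller, at the cost of doing the decode work at release time.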

The embodiments of FIGS. 12 and 13 illustrate the case in which the audio data is synchronized to the video data, but the reverse is also possible: the video data can be synchronized to the audio data.

To show this, let the media data consist of first media type data (audio) and second media type data (video), and apply these to the embodiments of FIGS. 12 and 13.

The receiving unit 141 receives the media data and passes it to the data separator. The data separator separates the media data into the first media type data (audio) and the second media type data (video).

If the first media type data (audio) is not in container units, the container unit 142 packages the frames constituting the first media type data (audio) by the unit number and converts them into chunk data.

The media restoring unit 143 decodes the chunk data with a decoder embedded in the web browser to restore the first media type data (audio) and, when outputting the first media type data (audio), passes time information about that time point, in units of chunk data, to the sync unit (corresponding to the audio sync unit in FIGS. 12 and 13). Here, the decoder embedded in the web browser may be a video tag. Because a video tag can decode and render audio data, the media restoring unit 143 has no difficulty outputting the first media type data (audio).

Based on the received time information, the sync unit outputs the second media type data (video) in synchronization with the restored first media type data (audio).

As described above, synchronized media playback according to the embodiments of FIGS. 12 and 13 is also possible when the first media type data is audio and the second media type data is video.

The assignment of audio to the first media type data and video to the second media type data is only a setting chosen to illustrate various embodiments and does not limit the types of the first and second media type data.

Next, an embodiment of the media playback apparatus 120 with a different structure is described. FIG. 14 shows yet another embodiment of the media playback apparatus 120. The media playback apparatus 120 of FIG. 14 includes a receiving unit, a data separator, a container unit, a media restoring unit, and an audio restoring unit. The receiving unit, the data separator, the container unit, and the audio restoring unit may be implemented in JavaScript. The media playback apparatus 120 of FIG. 14 may be configured by receiving the script modules of FIGS. 9 and 10.

The receiving unit and the data separator are configured as in FIG. 12. The receiving unit receives the media data generated by the media service apparatus 110, and the data separator separates the media data into video data and audio data. The separated video data is depacketized by the depacketization module 122a and passed to the container unit.

The container unit includes a container generation module 127, which packages the frames constituting the video data by the unit number and converts them into chunk data. The unit number of frames can be adjusted variably according to the frames per second (fps) of the video data.
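The description does not state how the unit number is derived from the frame rate. One plausible policy, shown purely as an assumption, is to keep the duration of each chunk near a fixed target so that the chunk duration stays roughly constant across frame rates. The function name and the 0.1 second default are hypothetical.

```javascript
// Hypothetical policy: choose the unit number so that one chunk lasts about
// `targetChunkSeconds`, with at least one frame per chunk.
function unitCountForFps(fps, targetChunkSeconds = 0.1) {
  if (!(fps > 0) || !(targetChunkSeconds > 0)) {
    throw new RangeError("fps and targetChunkSeconds must be positive");
  }
  return Math.max(1, Math.round(fps * targetChunkSeconds));
}
```

Under this assumed policy, 30 fps video yields a unit number of 3, which matches the example used for FIG. 16, and very low frame rates fall back to one frame per chunk.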

The media restoring unit includes an MSE 134 and a video tag 135. It decodes the chunk data received from the container unit to restore the video and outputs the restored video.

While the media restoring unit restores and outputs the video data, the audio restoring unit decodes and restores the audio data separated by the data separator and outputs the restored audio data in synchronization with the video data output by the media restoring unit.

The audio restoring unit may include a transcode unit. The transcode unit is a module that transcodes audio data into another codec format and includes an audio transcoder 123. When the input audio data is in a format not supported by the media restoring unit, that is, a codec format not supported by the video tag, the audio transcoder 123 can transcode the audio data into a codec format supported by the video tag and output it.
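The transcoding decision above can be sketched as a small function. The function name, the codec labels, and the caller-supplied support predicate are assumptions made for illustration. In a browser, the predicate could wrap `MediaSource.isTypeSupported()` with a full MIME type string, but that wiring is not shown here.

```javascript
// Decide whether audio must be transcoded before it is packaged into chunk data.
// `isSupportedByVideoTag` is a caller-supplied predicate (hypothetical wiring).
function planAudioPath(inputCodec, isSupportedByVideoTag, targetCodec) {
  if (isSupportedByVideoTag(inputCodec)) {
    // The video tag can decode this codec, so the audio passes through unchanged.
    return { transcode: false, codec: inputCodec };
  }
  // Otherwise transcode into a codec the video tag is known to support.
  return { transcode: true, codec: targetCodec };
}

// Illustrative use with an assumed support table.
const assumedSupported = new Set(["aac"]);
const examplePlan = planAudioPath("g711", (c) => assumedSupported.has(c), "aac");
```

Keeping the support check behind a predicate makes the decision logic testable outside a browser and leaves the actual capability query to the environment.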

The audio transcoded by the audio restoring unit is passed to the container generation module 127. If the video data received from the depacketization module 122a is not in container units, the container generation module 127 can generate chunk data by packaging the frames constituting the video data by the unit number while also packaging the audio data received from the audio restoring unit by that unit number. Chunk data generated in this way can be passed to the MSE 134 without compatibility problems.
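The grouping step above can be sketched as follows. This shows only how frames might be grouped by the unit number. It does not produce an actual container format that the MSE would accept, and the frame shape (`{ time, data }`), the function name, and the rule of assigning audio frames by timestamp are all assumptions.

```javascript
// Group video frames by the unit number and attach the audio frames whose
// timestamps fall within each group's time span (hypothetical data shapes).
function buildChunkData(videoFrames, audioFrames, unitCount) {
  if (!Number.isInteger(unitCount) || unitCount < 1) {
    throw new RangeError("unitCount must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < videoFrames.length; i += unitCount) {
    const video = videoFrames.slice(i, i + unitCount);
    // A chunk spans from its first video frame to the first frame of the next chunk.
    const start = video[0].time;
    const next = videoFrames[i + unitCount];
    const end = next ? next.time : Infinity;
    const audio = audioFrames.filter((a) => a.time >= start && a.time < end);
    chunks.push({ index: i / unitCount, video, audio });
  }
  return chunks;
}
```

In this sketch, audio frames timestamped before the first video frame are not assigned to any chunk. A real implementation would have to decide how to handle them.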

The process by which the audio restoring unit decodes, renders, and outputs the audio data can be incorporated into the media restoring unit, because the container unit generates the chunk data from both the video data and the audio data. The video data and the audio data can therefore be output by decoding and rendering the chunk data through the MSE 134 and the video tag 135 of the media restoring unit.

With the media playback apparatus 120 of FIG. 14, transcoding the audio data and converting it into chunk data makes it possible to decode and render, with the video tag, audio data whose codec format the video tag does not support. Because the audio data is restored together with the video data, the media data can be played without synchronization problems.

FIG. 17 is an exemplary diagram illustrating the process of generating a script module implemented in JavaScript according to an embodiment of the present invention.

Referring to FIG. 17, a script module implemented in JavaScript can be obtained by converting source written in conventional native C and C++ code, using a converter such as Emscripten, into JavaScript code that can be used in a browser.

Using a converter such as Emscripten, a decoder or container implemented in JavaScript can be obtained from conventional native code, which has the advantage of lowering codec dependency.

Because JavaScript code is used instead of a plug-in, there is no need to worry about web browser vendors discontinuing support. Nor is there any need to decide between the ActiveX interface and the NPAPI interface depending on the web browser. In other words, web browser dependency can be lowered.

The media playback apparatus 120 shown in FIG. 1 may be implemented, for example, as the computing device 400 shown in FIG. 18. The computing device 400 may be, but is not limited to, a mobile handheld device (smartphone, tablet computer, etc.), a laptop or notebook computer, a distributed computer system, a computing grid, or a server. The computing device 400 may include a processor 401, a memory 403, and storage 408 that communicate with one another or with other elements through a bus 440. The bus 440 may be connected to a display 432, one or more input devices 433, and one or more output devices 434.

All of these elements may be connected to the bus 440 directly or through one or more interfaces or adapters. The bus 440 is connected to a wide range of subsystems. The bus 440 may include a memory bus, a memory controller, a peripheral bus, a local bus, and combinations thereof.

The memory 403 may consist of random access memory (RAM) 404, read-only memory (ROM) 405, or a combination thereof. A BIOS (or firmware) containing the basic routines needed for booting the computing device 400 may also be included in the memory 403.

The storage 408 is used to store an operating system 409, executable files (EXEC) 410, data 411, API applications 412, and the like. The storage 408 may include a hard disk drive, an optical disc drive, a solid-state drive (SSD), and the like.

The computing device 400 may include an input device 433. A user can enter commands and/or information into the computing device 400 through the input device 433. Examples of the input device 433 include a keyboard, mouse, touchpad, joystick, gamepad, microphone, optical scanner, and camera. The input device 433 may be connected to the bus 440 through an input interface 423 that includes a serial port, parallel port, game port, USB, and the like.

In a particular embodiment, the computing device 400 is connected to a network 430 and, through it, to other devices. The network interface 420 receives communication data in the form of one or more packets from the network 430, and the computing device 400 stores the received communication data for processing by the processor 401. Likewise, the computing device 400 stores communication data to be transmitted in the memory 403 in the form of one or more packets, and the network interface 420 transmits that communication data to the network 430.

The network interface 420 may include a network interface card, a modem, and the like. Examples of the network 430 include the Internet, a wide area network (WAN), a local area network (LAN), a telephone network, and direct-connection communication, and wired and/or wireless communication schemes may be employed.

The result of the processor 401 executing a software module may be shown on the display 432. Examples of the display 432 include a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, a cathode ray tube (CRT), and a plasma display panel (PDP). The display 432 is connected to the bus 440 through a video interface 422, and data transfer between the display 432 and the bus 440 may be controlled by a graphics controller 421.

In addition to the display 432, the computing device 400 may include one or more output devices 434, such as audio speakers and printers. The output device 434 is connected to the bus 440 through an output interface 424. The output interface 424 may be, for example, a serial port, a parallel port, a game port, or USB.

Those of ordinary skill in the art to which the present invention pertains will understand that the present invention can be embodied in other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims that follow rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

110: media service apparatus 111: camera
112: encoder 113: packetizer
114: web server 115: module storage unit
116: module transmission unit 117: control unit
120: media playback apparatus 121: RTSP/RTP client module
122: depacketization module 123: audio transcoder
124: JS decoder module 125: JS renderer module
126: audio decoder 127: container generation module
128: audio chunk unit 129: buffer controller
141: receiving unit 142: container unit
143: media restoring unit 144: audio sync unit
210: web browser

Claims

A media playback apparatus for synchronously reproducing video and audio on a web browser, comprising:
a receiving unit that receives media data generated by a media service apparatus using a communication protocol supporting a web service;
a data separator that separates the received media data into video data and audio data;
a container unit that packages the frames constituting the video data by a unit number and converts them into chunk data;
a media restoring unit that restores video by decoding the converted chunk data with a decoder embedded in the web browser and, when the restored video is output, provides time information about the time point at which the restored video is output, in units of the chunk data; and
an audio sync unit that outputs the audio data in synchronization with the restored video based on the time information provided in units of the chunk data,
wherein the media restoring unit includes a renderer that performs rendering before the restored video is output, and the time information indicates the time point at which the chunk data corresponding to the time information is rendered by the renderer.

The media playback apparatus of claim 1, wherein the container unit adjusts the unit number according to the frames per second (FPS) of the video data.

(Deleted)

The media playback apparatus of claim 1, wherein the time information indicates the time point at which the beginning of the chunk data corresponding to the time information is rendered by the renderer.

The media playback apparatus of claim 1, wherein the audio sync unit decodes the separated audio data to restore audio and outputs the restored audio in synchronization with the time information provided in units of the chunk data.

The media playback apparatus of claim 5, wherein the audio sync unit comprises:
an audio decoder that decodes the separated audio data with a script parseable by the web browser;
a buffer controller that synchronizes the decoded audio data with the chunk data corresponding to the time information and provides it to an audio renderer; and
the audio renderer, which renders the decoded audio data.

The media playback apparatus of claim 6, wherein the chunk data is compressed data that has not yet been decoded, and the audio data synchronized with the chunk data is decoded and decompressed data.

The media playback apparatus of claim 6, wherein the audio renderer is implemented with an audio application program interface (API) supported by the web browser.

The media playback apparatus of claim 1, wherein the audio sync unit buffers the separated audio data in synchronization with the time information provided in units of the chunk data, and decodes the buffered audio data to restore and output audio.

The media playback apparatus of claim 9, wherein the audio sync unit comprises:
a buffer controller that buffers the separated audio data in synchronization with the chunk data corresponding to the time information;
an audio decoder that decodes the buffered audio data with a script parseable by the web browser; and
an audio renderer that renders the decoded audio data.

The media playback apparatus of claim 10, wherein the chunk data is compressed data that has not yet been decoded, and the audio data synchronized with the chunk data is data that has not yet been decoded by the audio decoder.

The media playback apparatus of claim 1, wherein the decoder embedded in the web browser is a video tag supported by HTML5, and the receiving unit, the container unit, and the audio sync unit are implemented in JavaScript.

The media playback apparatus of claim 12, wherein the JavaScript is downloaded from the media service apparatus to the media playback apparatus.

A media service apparatus for transmitting media data to a media playback apparatus, comprising:
a module storage unit that stores a script module needed to play the media data on a web browser of the media playback apparatus;
a module transmission unit that transmits the script module to the media playback apparatus in response to a connection from the media playback apparatus;
a packetizer that packetizes the media data to generate transport packets; and
a web server that establishes a communication session with the media playback apparatus and transmits the transport packets to the media playback apparatus in response to a request from the media playback apparatus,
wherein the script module includes a process of receiving the transport packets through the communication session, a process of packaging the video frames included in the transport packets by a unit number and converting them into chunk data, and a process of outputting the audio data included in the transport packets in synchronization with the chunk data, based on time information about the time point at which the chunk data is decoded and output by a media restoring unit installed in the media playback apparatus, and
wherein the time information indicates the time point at which the chunk data corresponding to the time information is rendered by a renderer included in the media restoring unit.

The media service apparatus of claim 14, wherein the script module is code written in JavaScript that is parseable by the web browser.

The media service apparatus of claim 14, wherein the time information indicates the time point at which the beginning of the chunk data corresponding to the time information is rendered by the renderer in the media playback apparatus.

The media service apparatus of claim 14, wherein the process of outputting the audio data in synchronization includes restoring audio by decoding the audio data included in the transport packets and outputting the restored audio in synchronization with the time information provided in units of the chunk data.

The media service apparatus of claim 14, wherein the process of outputting the audio data in synchronization includes:
a process of buffering the audio data in synchronization with the chunk data corresponding to the time information;
a process of decoding the buffered audio data with a script parseable by the web browser; and
a process of rendering the decoded audio data.

The media service apparatus of claim 14, wherein the chunk data is compressed data that has not yet been decoded, and the audio data synchronized with the chunk data is data decoded and decompressed by the media playback apparatus.

A media playback apparatus for synchronously reproducing video and audio on a web browser, comprising:
a receiving unit that receives media data generated by a media service apparatus using a communication protocol supporting a web service;
a data separator that separates first media type data and second media type data from the received media data;
a container unit that packages the frames constituting the first media type data by a unit number and converts them into chunk data;
a media restoring unit that restores first media by decoding the converted chunk data with a decoder embedded in the web browser and, when the restored first media is output, provides time information about the time point at which the restored first media is output, in units of the chunk data; and
a sync unit that outputs the second media type data in synchronization with the restored first media based on the time information provided in units of the chunk data,
wherein the media restoring unit includes a renderer that performs rendering before the first media is output, and the time information indicates the time point at which the chunk data corresponding to the time information is rendered by the renderer.