KR101478918B1

KR101478918B1 - Apparatus and method for correcting caption subtitle

Info

Publication number: KR101478918B1
Application number: KR1020130097211A
Authority: KR
Inventors: 한성희; 하명환; 이범구; 정병희
Original assignee: 한국방송공사
Priority date: 2013-08-16
Filing date: 2013-08-16
Publication date: 2014-12-31

Abstract

The present invention relates to an apparatus for correcting the captions of a broadcast program. According to one aspect, the caption correction apparatus receives captions created by stenography of the broadcast audio, together with the programming schedule information of the broadcast. Based on the time at which a caption is received, the apparatus searches the broadcast programs included in the schedule information for the program that corresponds to the caption. The information of the program found is matched with the caption to generate a caption file and caption data. The text format and time synchronization value of each caption in the caption file are then corrected based on the speech recognition result for the broadcast and the intended use of the caption file.

Description

APPARATUS AND METHOD FOR CORRECTING CAPTION SUBTITLE

The examples described below relate to an apparatus for correcting the caption subtitles of a broadcast program, and more particularly to an apparatus that corrects caption subtitles so that they can be used in various applications.

Services such as closed caption subtitles, sign language, and video description are used in broadcasting so that people with disabilities can use TV broadcast content. Of these, closed caption subtitles are the most widely used.

Although broadcasters currently provide full closed caption support only for terrestrial broadcasting, closed captions are expected to be supported gradually in web and mobile multimedia as well. In addition, closed caption subtitles can be regarded as a valuable resource, serving as program metadata for keyword extraction or scene retrieval and as a first draft for foreign-language translation.

Broadcasters therefore face the need for a method or a new system for reusing closed caption subtitles efficiently. Such a reuse method and system need to be designed with the characteristics of closed caption subtitles and compatibility with the systems currently in use in mind.

Since 2008, the right of people with disabilities to access broadcast content has been guaranteed by law in Korea. In 2011, specific guidelines for such access were established, covering matters such as target classification and program ratios.

The present invention provides an apparatus that stores closed caption subtitles as files on a per-program basis, changes the text format of the captions in each file into various forms according to a setting, and corrects their time synchronization, thereby generating caption subtitles for use in a variety of applications, such as Digital Television (DTV) rebroadcasting, reuse of broadcast programs on cable and satellite platforms, Video On Demand (VOD) subtitles on web or mobile multimedia service platforms, and first drafts for foreign-language translation.

The present invention also provides an apparatus that generates closed caption subtitles of higher quality than existing closed caption subtitles, suitable for reuse.

According to one aspect, a caption subtitle correction apparatus includes: a receiving unit that receives caption subtitles generated by stenography of the broadcast audio and the programming schedule information of the broadcast; a caption data generation unit that, based on the reception time of the caption subtitles, searches the broadcast programs included in the programming schedule information for the program corresponding to the caption subtitles and matches the information of the program found with the caption subtitles to generate a caption file and caption data; and a correction unit that corrects the text format and time synchronization value of each caption subtitle in the caption file based on the intended use of the caption file and the speech recognition result for the broadcast.

The receiving unit receives the stenography data through a dedicated modem and may store the reception time and text of the stenography data in log form or pass them directly to the caption data generation unit in real time.

The caption file generated by the caption data generation unit contains the caption text of the program and, for each caption text, a timestamp indicating the time position at which it appears. The caption data generated by the caption data generation unit may include at least one of: the title of the broadcast program, its broadcast time, a brief description of the program, the unique identifier of the program, the broadcast sequence number of the program, the edit version of the caption subtitles, the caption file format (for example, SRT, SMI, or SUB), and a language code.

The correction unit may include a speech recognition unit that recognizes speech in the video of the broadcast program, and a time synchronization correction unit that compares the timestamps of the recognized speech with the timestamps of the caption text to correct the time synchronization of the caption subtitles.

The correction unit may include a format changing unit that changes the format of the caption subtitles to a format set according to their intended use, and a time synchronization correction unit that corrects the time synchronization of the caption subtitles using the timing of the speech uttered in the video of the broadcast program and a speech dictionary built from the text of the reformatted caption subtitles.

When recognizing speech, the speech recognition unit may search a speech dictionary built from the text of the caption file generated by the caption data generation unit, adopt the matched word as the recognition result, and generate the utterance time of that speech as a new timestamp. When only one segment of the audio is to be recognized, the unit may build a speech dictionary only from those texts of the caption file whose timestamps fall within a preset margin of the utterance time of that segment, search only the words of that dictionary, and adopt the matched word as the recognition result. Even when recognizing the audio of an entire video, the speech recognition unit may repeat this segment-wise recognition so that the dictionary consulted at each step stays small, which can improve overall recognition accuracy.

The speech recognition unit may detect segments in which music is present in addition to speech.

The speech recognition unit may distinguish the speakers in the video of the broadcast program and recognize the speech uttered by each speaker.

The time synchronization correction unit may correct the time synchronization of the captions using the new timestamps generated by the speech recognition unit and the text of the original caption file. After the format changing unit has changed the line breaks of the original caption text, the time synchronization can be corrected by replacing the timestamp of each caption line with the new timestamp assigned to the first word of that line.
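
The line-level retiming described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the recognizer is assumed to yield (word, time) pairs in utterance order, exact string matching stands in for whatever matching the real system uses, and the names `WordStamp` and `retime_lines` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class WordStamp:
    word: str    # word adopted as a recognition result
    time: float  # utterance time in seconds (the "new timestamp")

def retime_lines(lines, word_stamps):
    """Give each reformatted caption line the new timestamp of its first word.

    lines: caption lines (strings) after the line breaks were changed.
    word_stamps: recognized words with timestamps, in utterance order.
    Returns a list of (time_or_None, line); None means no match was found.
    """
    result = []
    cursor = 0  # search forward only, so a repeated word maps to a later utterance
    for line in lines:
        words = line.split()
        stamp = None
        if words:
            for i in range(cursor, len(word_stamps)):
                if word_stamps[i].word == words[0]:
                    stamp = word_stamps[i].time
                    cursor = i + 1
                    break
        result.append((stamp, line))
    return result
```

Lines whose first word is never matched come back with `None`, which leaves room for a later manual check rather than guessing a time.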

The correction unit may include a revision unit that checks and corrects spelling errors in the caption text and errors in the time synchronization correction, and a text coloring unit that colors the caption text so that the speakers in the broadcast program can be distinguished.

According to another aspect, the caption subtitle correction apparatus may further include a folder monitoring unit that monitors the names of video files and caption data in a folder so that, when the name of a broadcast program's video file in the folder matches the name of the caption data, the time synchronization of the caption subtitles in that caption data is corrected.

According to another aspect, the caption subtitle correction apparatus may further include a schedule monitoring unit that monitors the programming schedule at a preset interval, based on the programming schedule information of the broadcast, for the purpose of correcting the time synchronization of the caption subtitles.

According to another aspect, the caption subtitle correction apparatus may further include a filtering unit that filters a specific program out of the programming schedule information so that the correction unit does not operate on that program.

According to another aspect, the caption subtitle correction apparatus may further include a manual correction unit that, as the user requires, changes the time synchronization and sentence length of all caption lines in a batch, or merges, splits, or retimes individual lines.

According to one aspect, a caption subtitle correction method includes: receiving closed caption subtitles generated by stenography of the broadcast audio and the programming schedule information of the broadcast; searching, based on the reception time of the caption subtitles, the broadcast programs included in the programming schedule information for the program corresponding to the caption subtitles; matching the information of the program found with the caption subtitles to generate a caption file and caption data; and correcting the text format and time synchronization value of each caption subtitle in the caption file based on the speech recognition result for the broadcast.

The correcting step may include recognizing speech in the video of the broadcast program, and comparing the timestamps of the recognized speech with the timestamps of the caption data to correct the time synchronization of the caption subtitles.

The correcting step may include changing the format of the caption subtitles to a format set according to their intended use, recognizing speech in the video of the broadcast program, and correcting the time synchronization by replacing the timestamp assigned to the first word of each line of the caption subtitles with the new timestamp generated through speech recognition.

The present invention can provide an apparatus that stores closed caption subtitles as files on a per-program basis, changes the text format of the captions in each file into various forms according to a setting, and corrects their time synchronization, thereby generating caption subtitles for use in a variety of applications, such as Digital Television (DTV) rebroadcasting, reuse of broadcast programs on cable and satellite platforms, Video On Demand (VOD) subtitles on web or mobile multimedia service platforms, and first drafts for foreign-language translation.

The present invention can also provide an apparatus that generates closed caption subtitles of higher quality than existing closed caption subtitles, suitable for reuse.

FIG. 1 is a block diagram of a caption subtitle correction apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram of a correction unit according to a first embodiment of the present invention.
FIG. 3 is a block diagram of a correction unit according to a second embodiment of the present invention.
FIG. 4 is a block diagram of a correction unit according to a third embodiment of the present invention.
FIG. 5 is a diagram of a system in which a caption subtitle correction apparatus according to an embodiment of the present invention is used.
FIG. 6 is a flowchart of a caption subtitle correction method according to an embodiment of the present invention.
FIGS. 7A to 7C are diagrams showing examples of speech recognition using a speech dictionary according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

In Korea, terrestrial broadcasters are legally obliged to provide a closed caption service, and the captions are currently produced by real-time stenography. However, there is a growing need to provide closed captions on other service platforms after the live broadcast as well. In response, the present invention can provide a method of generating a closed caption file and correcting and converting it into a high-quality caption file that can be supplied to the applications of various service systems, so that existing closed caption subtitles are reused efficiently. By using speech recognition to correct the time synchronization of closed caption subtitles automatically, the present invention can minimize the portions that require manual correction.

FIG. 1 is a block diagram of a caption subtitle correction apparatus according to an embodiment of the present invention.

Referring to FIG. 1, a caption subtitle correction apparatus according to one embodiment may include a receiving unit 110, a caption data generation unit 120, and a correction unit 130 as essential components. A caption subtitle correction apparatus according to another embodiment may additionally include a folder monitoring unit 140, a schedule monitoring unit 150, a filtering unit 160, a manual correction unit 170, and a central management system 180.

The receiving unit 110 receives caption subtitles generated by stenography of the broadcast audio and the programming schedule information of the broadcast. Here, caption subtitles means closed caption subtitles for people with disabilities. The caption subtitles include all text corresponding to the speech in the broadcast program.

When caption data generated by real-time stenography arrives through a dedicated modem connected to the broadcaster, the receiving unit 110 taps the data before it is passed to the DTV inserter and encoder, and may store the caption reception time and text in log form or pass them directly to the caption data generation unit 120 in real time. Because the programming schedule of the broadcast is controlled by APC (Automatic Program Control), the APC reflects schedule changes and schedule information most accurately. The receiving unit 110 may receive the programming schedule information from a schedule interface connected to the APC, using SOAP (Simple Object Access Protocol) or a RESTful web service. For example, the programming schedule information may include, for each broadcast program scheduled in a time slot, the title, the broadcast time, the unique identifier (Program ID), a brief description, and the episode number of the program.

The caption data generation unit 120 may, based on the reception time of the caption subtitles, search the broadcast programs included in the programming schedule information for the program corresponding to the caption subtitles, and match the information of the program found with the caption subtitles to generate a caption file. The caption file may be written in any of the commonly distributed caption file formats, such as SAMI, SRT, or SUB, and is completed as a per-program caption file by referring to the time at which the caption text was received by the receiving unit 110 and the programming schedule information.
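
As a rough illustration of matching captions to programs by reception time, the sketch below groups logged caption lines by the schedule entry whose time slot contains their reception time. The record layout and the names `ScheduleEntry`, `find_program`, and `group_by_program` are assumptions made for this example; the source does not specify how the lookup is implemented.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class ScheduleEntry:
    program_id: str   # unique identifier (Program ID)
    title: str
    start: datetime   # scheduled start of the time slot
    end: datetime     # scheduled end of the time slot (exclusive)

def find_program(schedule, received_at) -> Optional[ScheduleEntry]:
    """Return the schedule entry whose slot contains the reception time."""
    for entry in schedule:
        if entry.start <= received_at < entry.end:
            return entry
    return None

def group_by_program(schedule, caption_log):
    """Split a caption log into per-program caption lists.

    caption_log: list of (received_at, text) pairs, as stored by the receiver.
    Returns {program_id: [(received_at, text), ...]}; captions that fall
    outside every slot are skipped in this sketch.
    """
    files = {}
    for received_at, text in caption_log:
        entry = find_program(schedule, received_at)
        if entry is not None:
            files.setdefault(entry.program_id, []).append((received_at, text))
    return files
```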

In addition, the caption data (also called metadata) for the caption file may include at least one of: the title of the broadcast program, the broadcast time, a brief description of the program, the unique identifier (Program ID) of the program, the broadcast sequence number of the program, the edit version of the caption subtitles, the caption file format, the character encoding, and the language code of the caption subtitles. The language code indicates, using a code defined per language, whether the caption subtitles are in Korean, English, or another language. The caption data generation unit 120 may upload the generated caption file and caption data to the central management system 180.
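
One way to picture the caption data is as a record in which every field is optional, since the text only requires "at least one of" them. The sketch below is an assumption-laden illustration: the class name, field names, and value types are hypothetical and not taken from the source.

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class CaptionMetadata:
    # Every field is optional because the caption data may carry any subset.
    title: Optional[str] = None
    broadcast_time: Optional[str] = None
    description: Optional[str] = None
    program_id: Optional[str] = None          # unique identifier (Program ID)
    broadcast_sequence: Optional[int] = None  # broadcast sequence number
    edit_version: Optional[int] = None
    file_format: Optional[str] = None         # e.g. "SAMI", "SRT", "SUB"
    encoding: Optional[str] = None            # character encoding
    language_code: Optional[str] = None       # e.g. "ko", "en" (assumed code style)

    def to_record(self):
        """Return only the fields that were actually set."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

A record built this way could then be uploaded alongside the caption file, with unset fields simply omitted.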

The correction unit 130 may correct the time synchronization of the caption subtitles in the caption file generated by the caption data generation unit 120, based on the speech recognition result for the broadcast video. The correction unit 130 may correct the time synchronization so that the timestamps of the caption subtitles coincide with the timing of the speech recognized in the video of the matching broadcast program. Any of the speech recognition methods known to those skilled in the art may be used to recognize speech in the video. The correction unit 130 may also change the text format according to the intended use of the caption file, check spelling and the accuracy of the correction result, and cause the text to be displayed in a different color for each speaker.

The folder monitoring unit 140 may monitor the names of video files and caption files so that, when the name of a broadcast program's video file in a pre-registered folder matches the name of a caption file, the correction unit 130 starts the correction job automatically. Alternatively, it may monitor only new video files: if the name of a video file contains the unique identifier (Program ID) of a broadcast program, it may request the captions for that video by Program ID through the interface of the central management system 180, download them, and then start the job.
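
The name-matching check can be sketched as below. The extension sets and the function name `find_ready_pairs` are hypothetical, and the function takes a plain list of file names (for example, the result of `os.listdir` on the registered folder) so that the matching logic stays separate from any polling loop.

```python
from pathlib import Path

# Assumed extensions for this sketch; the source names only the caption formats.
VIDEO_EXTENSIONS = {".mp4", ".mxf"}
CAPTION_EXTENSIONS = {".smi", ".srt", ".sub"}

def find_ready_pairs(file_names):
    """Return (video, caption) pairs whose names match apart from the extension."""
    videos, captions = {}, {}
    for name in file_names:
        path = Path(name)
        ext = path.suffix.lower()
        if ext in VIDEO_EXTENSIONS:
            videos[path.stem] = name
        elif ext in CAPTION_EXTENSIONS:
            captions[path.stem] = name
    # A correction job can start only when both files are present.
    return sorted((videos[s], captions[s]) for s in videos.keys() & captions.keys())
```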

When a new folder is registered for folder monitoring, the mode of automatic processing can also be specified in advance. That is, for each monitored folder, a sentence format matching the intended future use of the captions to be corrected is set beforehand, and when a new job target arrives, the folder monitoring unit 140 may ask the correction unit 130 to perform the automatic job according to that setting. For example, DTV captions, VOD captions 40 bytes wide and 2 lines high, or complete sentences for a foreign-language translation draft can be specified when the new monitoring folder is registered.
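
To make the "40 bytes wide and 2 lines high" preset concrete, the following sketch wraps caption text by encoded byte length and groups the lines into blocks. It is an illustration only: the EUC-KR encoding is an assumption (the source does not say which encoding the byte width refers to), the function name `wrap_by_bytes` is hypothetical, and wrapping happens at whitespace only.

```python
def wrap_by_bytes(text, max_bytes=40, max_lines=2, encoding="euc-kr"):
    """Split text into caption blocks of at most max_lines lines.

    Each line is at most max_bytes long when encoded, except that a single
    word longer than max_bytes is kept whole on its own line.
    """
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if not current or len(candidate.encode(encoding)) <= max_bytes:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    # Group consecutive lines into blocks that are shown together.
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]
```

A per-folder preset could then be a small mapping from folder to parameters such as `max_bytes` and `max_lines`, which is one plausible reading of the registration step described above.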

The schedule monitoring unit 150 may monitor the programming schedule at a preset interval, based on the programming schedule information of the broadcast, and schedule jobs for the purpose of correcting the time synchronization of the caption subtitles. As a way of starting automatic jobs in the correction unit 130, schedule monitoring alone may be used, without the folder monitoring unit 140. For example, when the system operator views the schedule table provided by the schedule monitoring unit 150 and specifies the programs to be processed and their order, the schedule monitoring unit 150 may use the unique identifier (Program ID) of each job target scheduled at the corresponding time to request the caption file from the central management system 180 and the video file from storage, and have the correction unit 130 start the job.

The filtering unit 160 may filter a specific program out of the programming schedule information so that the correction unit 130 does not operate on that program. For example, in a music show or a concert relay, speech is difficult to distinguish from music, so the accuracy of the correction unit 130 may drop. Such programs can therefore be filtered out. Which programs are filtered can be adjusted according to the user's settings. For example, once a specific program code is registered for filtering, a file whose unique identifier falls under that program code does not trigger a job request to the correction unit 130 even when it is newly detected in a folder registered with the folder monitoring unit 140.
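
The filtering rule can be sketched as a simple check applied before a job is requested. The source does not say how a unique identifier relates to its program code; this example assumes, purely for illustration, that a Program ID begins with its program code. The function names and the sample IDs are hypothetical.

```python
def is_filtered(program_id, filtered_codes):
    """True if the Program ID falls under a program code registered for filtering.

    Assumption for this sketch: a Program ID starts with its program code.
    """
    return any(program_id.startswith(code) for code in filtered_codes)

def jobs_to_request(detected_program_ids, filtered_codes):
    """Keep only the newly detected programs that should go to the correction unit."""
    return [pid for pid in detected_program_ids
            if not is_filtered(pid, filtered_codes)]
```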

The manual correction unit 170 may, as the user requires, correct the time synchronization of the caption subtitles based on time synchronization values entered by the user. Timestamps can be adjusted based on the user's input for the whole caption file, for a segment, or for individual sentences. Besides time synchronization, the length of sentences and their merging and splitting can also be adjusted manually. The manual correction unit 170 may display only the spelling and timestamp corrections made by the correction unit 130, so that the user can review and change them.
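
The manual operations mentioned above (batch time shift, merging, and splitting of lines) can be sketched on a simple list of (timestamp, text) cues. The cue representation and the function names are assumptions made for this example, not the apparatus's actual interface.

```python
def shift_all(cues, offset):
    """Shift every cue by offset seconds, clamping at zero."""
    return [(max(0.0, t + offset), text) for t, text in cues]

def merge(cues, i):
    """Merge cue i with cue i + 1, keeping the timestamp of cue i."""
    t, text = cues[i]
    _, next_text = cues[i + 1]
    return cues[:i] + [(t, text + " " + next_text)] + cues[i + 2:]

def split(cues, i, word_index, new_time):
    """Split cue i before word_index; the second part starts at new_time."""
    t, text = cues[i]
    words = text.split()
    head = " ".join(words[:word_index])
    tail = " ".join(words[word_index:])
    return cues[:i] + [(t, head), (new_time, tail)] + cues[i + 1:]
```

Each function returns a new list and leaves its input untouched, which makes it straightforward to show the user a before-and-after view of a correction.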

The caption subtitles corrected by the correction unit 130 may be passed to the central management system 180, as may those corrected by the manual correction unit 170. The central management system 180 may manage the corrected caption subtitles and provide the stored caption subtitles at the request of various entities.

FIG. 2 is a block diagram of a correction unit according to the first embodiment of the present invention.

Referring to FIG. 2, a correction unit 130A according to one embodiment may include a speech recognition unit 210 and a time synchronization correction unit 220.

The speech recognition unit 210 may recognize speech in the video of the broadcast program. It may use an acoustic-phonetic approach, a pattern recognition approach, a neural network approach, an SVM (Support Vector Machine) approach, an artificial intelligence approach, or the like. For example, pattern recognition approaches may include templates, DTW (Dynamic Time Warping), VQ (Vector Quantization), and statistical techniques using an HMM (Hidden Markov Model).

The speech recognition unit 210 may distinguish the speakers in the video of the broadcast program and recognize the speech uttered by each speaker.

The speech recognition unit 210 may detect segments in which background music is present rather than human speech alone.

When recognizing speech, the speech recognition unit 210 may search a speech dictionary built from the text of the caption file generated by the caption data generation unit 120 of FIG. 1, adopt the matched word as the recognition result, and generate the utterance time of that speech as a new timestamp.

일 예로, 음성 인식부(210)는 음성의 한 구간만 인식하고자 할 때는, 자막 데이터 생성부(120)에서 생성한 자막 파일의 텍스트 중 해당 음성 구간의 발화 시점을 기준으로, 기 설정된 마진(margin)의 범위 안의 타임 스탬프를 가진 텍스트들만으로 음성 사전을 만든 뒤, 해당 음성 사전의 단어들만 검색하여 매칭된 단어를 음성의 인식 결과로 채택하는 과정을 취할 수 있다. For example, when recognizing only one section of a speech, the speech recognition unit 210 extracts a predetermined margin from the text of the subtitle file generated by the subtitle data generation unit 120, ), The speech dictionary is created using only the texts having the time stamps in the range of " 0 " to " 0 ", and only the words of the speech dictionary are searched to adopt the matched words as speech recognition results.
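The margin-limited dictionary described above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the `Caption` class, the function name `build_section_dictionary`, and the use of seconds as the time unit are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    start: float  # time stamp in seconds from program start (assumed unit)
    text: str


def build_section_dictionary(captions, utterance_time, margin):
    """Collect the words of captions whose time stamp lies within
    +/- margin seconds of the utterance time of the section."""
    words = set()
    for cap in captions:
        if abs(cap.start - utterance_time) <= margin:
            words.update(cap.text.split())
    return words


captions = [
    Caption(10.0, "good evening"),
    Caption(14.5, "here is the news"),
    Caption(60.0, "now the weather"),
]
# Only the first two captions fall within 5 seconds of t = 12.0.
section_words = build_section_dictionary(captions, utterance_time=12.0, margin=5.0)
```

A recognizer restricted to `section_words` has far fewer candidates to choose from than one using the vocabulary of the whole program, which is the motivation the description gives for the margin.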

As another example, even when performing speech recognition on the entire video, the speech recognition unit 210 may repeat the section-by-section recognition process, which narrows the range of the dictionary to be consulted during recognition and thereby increases the overall recognition accuracy.

The time synchronization correction unit 220 may compare the time stamp of the speech recognized by the speech recognition unit 210 with the time stamp of the subtitle data to correct the time synchronization of the caption subtitles included in the subtitle data. The time synchronization correction unit 220 may correct the time synchronization by replacing the time stamp of a caption subtitle with the time stamp of the recognized speech.

In addition, the time synchronization correction unit 220 may perform section correction based on the portion matched in the speech dictionary within a preset margin around the time stamp of the speech recognized by the speech recognition unit 210. The speech dictionary may store text recognized in advance through speech recognition together with its time stamps.
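A minimal sketch of this time stamp replacement is given below. It assumes, for illustration only, that the recognizer output is a list of `(time, word)` pairs and that a caption is matched by its first word; the function name `correct_time_stamps` and the data layout are hypothetical.

```python
def correct_time_stamps(captions, recognized, margin):
    """captions: list of (time_stamp, text) from the subtitle file.
    recognized: list of (time_stamp, word) from speech recognition.
    Returns a new list of (time_stamp, text)."""
    corrected = []
    for cap_time, text in captions:
        words = text.split()
        first_word = words[0] if words else None
        # Candidate utterances of the first word within the preset margin.
        candidates = [
            t for t, w in recognized
            if w == first_word and abs(t - cap_time) <= margin
        ]
        if candidates:
            # Take the utterance closest to the original time stamp.
            new_time = min(candidates, key=lambda t: abs(t - cap_time))
        else:
            # No match inside the margin: keep the original time stamp.
            new_time = cap_time
        corrected.append((new_time, text))
    return corrected


captions = [(13.0, "good evening"), (18.2, "here is the news"), (40.0, "(wind)")]
recognized = [(10.1, "good"), (10.6, "evening"), (14.9, "here"), (70.0, "good")]
result = correct_time_stamps(captions, recognized, margin=5.0)
```

The utterance of "good" at 70.0 seconds is ignored because it lies outside the margin, which shows how the margin prevents a repeated word elsewhere in the program from pulling a caption to the wrong position.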

FIG. 3 is a block diagram of a correction unit according to a second embodiment of the present invention.

Referring to FIG. 3, the correction unit 130B according to an embodiment may include a format changing unit 310, a speech recognition unit 320, and a time synchronization correction unit 330.

The format changing unit 310 may change the format of the caption subtitles to a format set according to their intended use. The subtitle data may be used in applications for a variety of purposes, such as digital television (DTV), multimedia on the web, mobile multimedia, and video on demand (VOD). The format needs to be changed because the way caption subtitles are displayed, and the space available for one line of caption subtitle on the screen (for example, the width and height allotted to the caption subtitles), differ from application to application.

Table 1 shows the caption subtitle format for each application.

[Table 1]

Information on how the caption subtitles are displayed (for example, roll-up or pop-on) and on how many bytes per screen line the caption subtitles should be divided by may be preset for each type of application. For example, subtitles generated as a first draft for foreign-language translation should be formatted in units of complete sentences, without being split midway, because word order differs between languages. The format changing unit 310 may change the display method and line format of the caption subtitles based on the information set for each application.
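The per-application settings and the byte-based line division could look roughly like the sketch below. The profile names and the VOD byte limit are hypothetical values chosen for illustration (the contents of Table 1 are not reproduced here); the 40-byte roll-up limit for DTV is the figure given elsewhere in this description. The choice of UTF-8 for counting bytes is also an assumption.

```python
# Hypothetical per-application profiles; only the DTV values come from the text.
FORMAT_PROFILES = {
    "dtv": {"style": "roll-up", "max_bytes": 40},
    "vod": {"style": "pop-on", "max_bytes": 30},  # assumed value
}


def wrap_by_bytes(text, max_bytes, encoding="utf-8"):
    """Divide text into lines of at most max_bytes encoded bytes,
    breaking only between words. A single word longer than the
    limit is kept whole on its own line."""
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if not current or len(candidate.encode(encoding)) <= max_bytes:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


english_lines = wrap_by_bytes("the quick brown fox jumps", 10)
korean_lines = wrap_by_bytes("안녕하세요 여러분", 20)
```

Counting encoded bytes rather than characters matters for Korean text, where each Hangul syllable occupies several bytes in most encodings.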

When recognizing speech, the speech recognition unit 320 may search for words in a speech dictionary built from the text of the subtitle file generated by the subtitle data generation unit 120 of FIG. 1, adopt a matched word as the recognition result, and generate the utterance time of the speech as a new time stamp.

The speech recognition unit 320 may operate in the manner described for the example of FIG. 2.

The time synchronization correction unit 330 may compare the time stamp of the speech recognized in the video with the time stamp of the subtitle data to correct the time synchronization of the reformatted caption subtitles included in the subtitle data.

The time synchronization correction unit 330 may correct the time synchronization of the caption subtitles by comparing the timing of the speech uttered in the video of the broadcast program with the time stamps of a speech dictionary composed of the text of the caption subtitles reformatted by the format changing unit 310. For example, the speech dictionary may be composed of the text of the reformatted caption subtitles. Based on the timing of the uttered speech, the time synchronization correction unit 330 may search the speech dictionary for the section matching that timing, find the portion matching the uttered speech, and correct the time synchronization.

FIG. 4 is a block diagram of a correction unit according to a third embodiment of the present invention.

Referring to FIG. 4, the correction unit 130C according to an embodiment may include a format changing unit 410, a speech recognition unit 420, a time synchronization correction unit 430, a text coloring unit 440, and an error correction unit 450.

The description of FIG. 3 applies as-is to the format changing unit 410, the speech recognition unit 420, and the time synchronization correction unit 430.

The text coloring unit 440 may color the text of the caption subtitles so that the caption subtitles are distinguished by speaker in the broadcast program. For example, the text may be colored yellow, white, red, and so on, so that the text of each speaker can be told apart.
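As a rough illustration of per-speaker coloring, the sketch below assigns each new speaker the next color from a palette and wraps the text in a `<font color="...">` tag, which is the kind of inline markup SAMI-style subtitle files accept. The function name `color_by_speaker`, the speaker identifiers, and the behavior of reusing colors once the palette is exhausted are assumptions.

```python
from html import escape
from itertools import cycle


def color_by_speaker(lines, palette=("yellow", "white", "red")):
    """lines: list of (speaker_id, text). Returns marked-up strings.
    Colors are assigned in order of first appearance and are reused
    cyclically if there are more speakers than palette entries."""
    colors = {}
    palette_iter = cycle(palette)
    marked_up = []
    for speaker, text in lines:
        if speaker not in colors:
            colors[speaker] = next(palette_iter)
        marked_up.append(f'<font color="{colors[speaker]}">{escape(text)}</font>')
    return marked_up


colored = color_by_speaker([("A", "hi"), ("B", "hello"), ("A", "bye")])
```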

The error correction unit 450 may check the spelling of the caption subtitle text and correct errors. The error correction unit 450 may check the spelling of the caption subtitles based on a spelling dictionary and correct the erroneous portions. In addition, when a correction error is identified using the differences between the time stamps corrected by the time synchronization correction unit 430 and the music section information detected by the speech recognition unit 420, the error correction unit 450 may further correct the time stamps.

FIG. 5 illustrates a system in which a caption subtitle correction apparatus according to another embodiment of the present invention is used. FIG. 5 describes the embodiment in terms of the names of models that can be used at a broadcasting station and of using new equipment in a way that is compatible with existing equipment.

Referring to FIG. 5, the system may include a caption modem 510, a caption subtitle correction apparatus 520, an automatic program control (APC) 530, a central management system 540, and internal and external services 550.

The caption modem 510 may receive caption subtitles generated through stenography of the broadcast audio and deliver them to a DTV caption inserter and an encoder.

The caption subtitle correction apparatus 520 may correct the caption subtitles received from the caption modem 510 in the manner described with reference to FIGS. 1 to 4. By correcting the time synchronization of the caption subtitles to match the actual broadcast audio and changing the format of the caption subtitles for each application, the caption subtitle correction apparatus 520 may generate high-quality subtitle data that is easy for applications to reuse.

The APC 530 may provide the programming schedule information of broadcast programs to the caption subtitle correction apparatus 520. The caption subtitle correction apparatus 520 may receive the programming schedule information and generate a subtitle file and subtitle data that include the corrected caption subtitles.

The central management system (CMS) 540 may store the subtitle data generated by the caption subtitle correction apparatus 520. The CMS 540 may provide the subtitle data to the internal and external services 550 so that they can reuse it.

The internal and external services 550 may correspond to applications for a variety of purposes, such as digital television (DTV) rebroadcasting, reuse of broadcast programs on cable and satellite platforms, video on demand (VOD) subtitles on web or mobile multimedia service platforms, or generation of first drafts for foreign-language translation.

The caption subtitle correction apparatus 520 may include a closed caption recorder (CRS). The CRS may be a component that includes the receiving unit 110, the subtitle data generation unit 120, and the schedule monitoring unit 150 of FIG. 1. When caption subtitles arrive from the stenographer at the caption modem 510, the CRS may convert them into an original subtitle file. The CRS may upload the original subtitle file to the central management system 540. For example, the original subtitle file may be in Synchronized Accessible Media Interchange (SAMI) format, with the extension .smi.

Using the daily program information received from the APC 530, the CRS may compare the reception time of the caption subtitles received from the caption modem 510 with the program schedule information to generate a relative time stamp for each text line of the subtitle file.
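One way to picture the relative time stamp is as the offset of the caption's reception time from the scheduled start of the program on air at that moment. The sketch below makes that assumption explicit; the schedule layout (dictionaries with `id`, `start`, and `end` keys), the program IDs, and the function names are hypothetical.

```python
from datetime import datetime


def find_program(schedule, received_at):
    """Return the schedule entry on air at received_at, or None."""
    for program in schedule:
        if program["start"] <= received_at < program["end"]:
            return program
    return None


def relative_time_stamp(schedule, received_at):
    """Return (program_id, seconds since program start), or None
    if no scheduled program covers the reception time."""
    program = find_program(schedule, received_at)
    if program is None:
        return None
    offset = (received_at - program["start"]).total_seconds()
    return program["id"], offset


schedule = [
    {"id": "P001", "start": datetime(2013, 8, 16, 21, 0), "end": datetime(2013, 8, 16, 22, 0)},
    {"id": "P002", "start": datetime(2013, 8, 16, 22, 0), "end": datetime(2013, 8, 16, 23, 0)},
]
stamp = relative_time_stamp(schedule, datetime(2013, 8, 16, 22, 3, 30))
```

Because the offset depends on the scheduled start, any unscheduled change in a live broadcast shifts every relative time stamp, which is one reason later correction against the recognized speech is useful.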

In the case of a prerecorded program, the subtitle file may be generated as soon as the program ends and uploaded to the CMS 540. In the case of a live program, generation of the subtitle file is completed about 10 minutes after the program ends, because the schedule information does not reflect unexpected schedule changes. When uploading a new subtitle file, the CRS may also upload information such as a unique program ID (PID) and a language code to the CMS 540.

Using the original subtitle file makes post-processing for caption subtitle correction easier. Every sentence in the original subtitle file includes an end mark, such as a period, a question mark, or an exclamation point. When the speaker changes, the sentence may begin with a dash, and the color of the sentence may change. When closed caption subtitles include a non-speech element (for example, the sound of wind), it may be inserted in parentheses.

Whether produced by stenography or by speech recognition, online captioning is very likely to incur delay. The delay may differ from sentence to sentence. Because of this irregular delay, caption subtitles may be displayed over a scene that does not match them. If an original subtitle file containing such delays is used as the subtitles for VOD or DVD, it may inconvenience viewers. The time delay of closed caption subtitles has a large effect on viewing satisfaction. In addition, the text format of the subtitles may differ depending on the type of application. Correcting the delay and changing the format manually, however, is costly and time-consuming.

Korean broadcasting stations do not use speech recognition for real-time closed captioning. This is because the number of channels and programs requiring closed caption subtitles is not large, and because concerns remain about the accuracy of Korean speech recognition. Broadcasting environments demand stricter accuracy.

However, applying speech recognition to an original subtitle file that reflects irregular time delays can be a technically more useful approach than dictation. The additional cost of correction can also be minimized. An automated server program that can adjust the timing and change the text format therefore needs to be implemented. Of course, because TV programs span a variety of genres, some are not suitable for speech recognition. Such exceptional cases need to be identified separately, and manual correction needs to be carried out alongside to compensate for them.

The automated server program (WiseSync) may be used in the caption subtitle correction apparatus 520. For example, the program may be used when the correction unit 130, the folder monitoring unit 140, the schedule monitoring unit 150, and the filtering unit 160 of FIG. 1 perform their respective functions. The purpose of the program is to reformat and align subtitles automatically, and the name WiseSync was chosen to reflect that purpose.

For example, the input to WiseSync may be an original subtitle file generated by the CRS, which stores DTV closed caption subtitles. DTV closed caption subtitles may be decoded and generated in roll-up style. The words of the subtitles may appear from left to right until the maximum line length of 40 bytes is filled. Line breaks sometimes occur in the middle of words.

Before realigning the subtitles using speech recognition, WiseSync may change the subtitle format according to the purpose of the application. For VOD, for example, WiseSync may change the subtitle format to pop-on style with no line breaks in the middle of words. The maximum length and height of the caption subtitles may be preset. In a draft for translation, each line should contain a complete sentence.
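For the translation-draft case, regrouping roll-up lines into one complete sentence per line can be sketched with the end marks (period, question mark, exclamation point) that the original subtitle file is said to carry. This sketch assumes, for simplicity, that the roll-up line breaks fall on word boundaries, so lines can be joined with a space; rejoining words split mid-word would need additional handling. The function name `to_sentence_lines` is hypothetical.

```python
import re


def to_sentence_lines(rollup_lines):
    """Join roll-up lines and split them again after each end mark,
    yielding one complete sentence per output line."""
    text = " ".join(line.strip() for line in rollup_lines if line.strip())
    # Split at whitespace that follows a period, question mark,
    # or exclamation point.
    parts = re.split(r"(?<=[.?!])\s+", text)
    return [part for part in parts if part]


sentences = to_sentence_lines([
    "Good evening. Here is",
    "the news. Is it",
    "raining? Yes!",
])
```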

The reformatted caption subtitles may obtain new time stamps through speech recognition. The words of the reformatted caption subtitles may be used to build the speech dictionary. Utterance times may be matched consecutively from the first word to the last word in the speech dictionary. Using the time stamps of the original subtitle file narrows the search range used to find the portion of the speech dictionary that matches the subtitle words. Narrowing the search range can maximize the accuracy of the post-processing result.

WiseSync may automatically perform a watch folder function and exchange subtitle data with the CMS 540. WiseSync may monitor an assigned folder to find subtitle files whose time synchronization needs correcting. If a video file and a subtitle file have the same file name, they may be treated as a pair and the post-processing operation described above may be performed. Alternatively, if the name of the video file includes a program ID, an operation of retrieving the original subtitle file with the same program ID from the CMS 540 may be performed.
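The file-name pairing rule of the watch folder can be illustrated with the short sketch below. It performs a single scan rather than continuous monitoring, and the video extensions and the function name `find_pairs` are assumptions; only the `.smi` extension comes from the description.

```python
import tempfile
from pathlib import Path

VIDEO_EXTS = {".mp4", ".mxf"}  # assumed video extensions
SUBTITLE_EXT = ".smi"          # SAMI subtitle files


def find_pairs(folder):
    """Return (video, subtitle) path pairs that share a file name stem."""
    folder = Path(folder)
    subtitles = {
        p.stem: p for p in folder.iterdir()
        if p.suffix.lower() == SUBTITLE_EXT
    }
    pairs = []
    for p in sorted(folder.iterdir()):
        if p.suffix.lower() in VIDEO_EXTS and p.stem in subtitles:
            pairs.append((p, subtitles[p.stem]))
    return pairs


# Small demonstration in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("news.mp4", "news.smi", "drama.mp4", "notes.txt"):
        (Path(tmp) / name).touch()
    pair_names = [(v.name, s.name) for v, s in find_pairs(tmp)]
```

Here `drama.mp4` has no subtitle file with the same name, so it would fall to the alternative path in the description, where the original subtitle file is looked up in the CMS by program ID.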

WiseSync shows considerable accuracy overall, except in certain genres such as music shows and concerts (for example, Music Bank and Gag Concert). Accordingly, WiseSync may search the broadcast program schedule for specific programs in order to filter the excepted programs out of post-processing. If the broadcast program list includes a program for which post-processing is not performed, post-processing may be skipped for subsequent episodes of that program even when the conditions for post-processing are otherwise satisfied.
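The exception filtering can be pictured as a simple exclusion list applied to the schedule, as in the sketch below. Matching by program title, the schedule layout, and the function name `select_for_post_processing` are assumptions; an actual system might match on a program ID instead.

```python
def select_for_post_processing(schedule, excluded_titles):
    """Keep only the scheduled programs that are not on the exclusion list.
    Every episode of an excluded title is skipped."""
    return [p for p in schedule if p["title"] not in excluded_titles]


excluded = {"Music Bank", "Gag Concert"}
schedule = [
    {"title": "News 9", "episode": 101},
    {"title": "Music Bank", "episode": 710},
    {"title": "Music Bank", "episode": 711},
]
selected = select_for_post_processing(schedule, excluded)
```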

The terminal program (WiseCat) may be used in the caption subtitle correction apparatus 520. For example, the program may be used when the manual correction unit 170 of FIG. 1 performs its function. The purpose of WiseCat is to complement the operation of WiseSync. WiseCat may take the form of a subtitle editor for an ordinary PC and may provide a few special functions.

WiseCat may be used to connect to WiseSync remotely, and the functions of WiseSync may be used over that remote connection. WiseCat may manually shift the timing of caption subtitles according to user input. WiseCat may also edit subtitles using other editing functions such as spell checking and text coloring.
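A manual time shift of the kind described can be sketched as adding a user-supplied offset to every caption from a chosen position onward. The millisecond unit (as used by SAMI `Start` values), the `start_index` parameter, and the clamping at zero are assumptions made for this illustration.

```python
def shift_captions(captions, offset_ms, start_index=0):
    """captions: list of (time_ms, text). Shift the time stamps of all
    captions from start_index onward by offset_ms, never going below 0."""
    return [
        (max(0, t + offset_ms) if i >= start_index else t, text)
        for i, (t, text) in enumerate(captions)
    ]


original = [(1000, "a"), (2000, "b"), (5000, "c")]
earlier = shift_captions(original, -1500)
later_tail = shift_captions(original, 300, start_index=1)
```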

Multiple video versions may exist for a single broadcast program, because the video may be edited or trimmed differently according to various video criteria. For example, there may be an encoded video and an edited video. The encoded video may be stored in a video archive, and the edited video may be stored in the integrated CMS 540 hub. Once a video is edited or trimmed, the previous synchronization can no longer be maintained. The subtitle version therefore needs to be updated to match the new video version. When the CRS generates an edited version of the subtitles, it may be uploaded to the CMS 540. Other systems, such as the internal and external services 550, may download and use the edited version of the caption subtitles that suits their purpose. Accordingly, when the CRS uploads a subtitle file to the CMS 540, a system ID, a unique ID, an edit version, a program ID, a language code, and the like may be uploaded with it.

Closed caption subtitles whose time synchronization has been corrected may be used as metadata. For example, closed caption subtitles may be used to search content stored in a video archive for a desired scene. For example, a scene in which a particular person or place is mentioned in the video may be found through the closed caption subtitles.
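Scene search over time-corrected captions amounts to returning the time stamps of the captions that mention a keyword, as in the minimal sketch below. The function name `find_scenes`, the case-insensitive substring match, and the sample captions are assumptions for illustration.

```python
def find_scenes(captions, keyword):
    """captions: list of (time_stamp, text). Return the time stamps of
    captions whose text contains the keyword, ignoring case."""
    key = keyword.casefold()
    return [t for t, text in captions if key in text.casefold()]


captions = [
    (12.0, "Welcome to Seoul."),
    (95.5, "The weather in Busan is clear."),
    (130.2, "Back in Seoul, traffic is heavy."),
]
seoul_scenes = find_scenes(captions, "seoul")
```

The usefulness of such a search depends directly on the accuracy of the time stamps, since each hit is only as good as the position it points to in the video.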

FIG. 6 is a flowchart of a caption subtitle correction method according to an embodiment of the present invention.

In step 610, the caption subtitle correction apparatus according to an embodiment may receive caption subtitles generated through stenography of the broadcast audio, together with the programming schedule information of the broadcast.

In step 620, the caption subtitle correction apparatus according to an embodiment may search the broadcast programs included in the programming schedule information for the broadcast program corresponding to the caption subtitles, based on the reception time of the caption subtitles.

In step 630, the caption subtitle correction apparatus according to an embodiment may match the information of the found broadcast program with the caption subtitles to generate a subtitle file and metadata. The metadata may be referred to as subtitle data and may include at least one of the title of the broadcast program, the broadcast time of the broadcast program, a brief description of the broadcast program, a unique identifier of the broadcast program, the broadcast round of the broadcast program, the edit version of the caption subtitles, the subtitle file format, and a language code.

In step 640, the caption subtitle correction apparatus according to an embodiment may correct the text format and the time synchronization value of each caption subtitle in the subtitle file based on the speech recognition result of the broadcast.

The caption subtitle correction apparatus according to an embodiment may recognize speech in the video of the broadcast program and compare the time stamp of the recognized speech with the time stamp of the subtitle data to correct the time synchronization of the caption subtitles.

The caption subtitle correction apparatus according to an embodiment may change the format of the caption subtitles to the format set for their intended use, recognize speech in the video of the broadcast program, and correct the time synchronization by replacing the time stamp assigned to the first word appearing on each line of the caption subtitles with the new time stamp generated through speech recognition.

The manner described with reference to FIGS. 1 to 4 applies equally to the method performed by the caption subtitle correction apparatus in each step of FIG. 6.

FIGS. 7A to 7C illustrate examples of speech recognition using a speech dictionary according to an embodiment of the present invention.

Referring to FIG. 7A, when recognizing speech, the speech recognition unit of FIGS. 2, 3, and 4 may search for words in a speech dictionary (the word dictionary of FIG. 7A) built from the text of the subtitle file generated by the subtitle data generation unit 120 of FIG. 1, adopt a matched word as the recognition result, and generate the utterance time of the speech as a new time stamp.

Referring to FIG. 7B, when only one section of the speech (for example, utterance 2) is to be recognized, a speech dictionary may be built from only those texts of the subtitle file generated by the subtitle data generation unit 120 of FIG. 1 whose time stamps fall within a preset margin of the utterance time of that speech section, after which only the words of that dictionary are searched and a matched word is adopted as the recognition result.

Referring to FIG. 7C, even when performing speech recognition on the entire video, the speech recognition unit of FIGS. 2, 3, and 4 may repeat the section-by-section recognition and matching process, which narrows the range of the dictionary to be consulted during recognition and increases the overall recognition accuracy.

The methods according to embodiments of the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software.

Although the present invention has been described above with reference to limited embodiments and drawings, it is not limited to those embodiments, and a person of ordinary skill in the art to which the invention pertains can make various modifications and variations from this description.

Therefore, the scope of the present invention should not be limited to the described embodiments, but should be defined by the claims set out below and their equivalents.

Claims

A receiving unit for receiving a caption subtitle generated through stenography corresponding to the speech of a broadcast, and programming schedule information of the broadcast;
A caption data generating unit for searching, among the broadcast programs included in the programming schedule information, for the broadcast program corresponding to the caption subtitle based on the reception time of the caption subtitle, and for generating a caption file and caption data by matching information of the found broadcast program with the caption subtitle; and
A correction unit for correcting a text format and a time synchronization value of each caption subtitle of the caption file based on the purpose of use of the caption file and a speech recognition result of the broadcast,
Wherein the correction unit comprises:
A speech recognition unit for recognizing speech in a video of the broadcast program; and
A time synchronization correction unit for comparing the time stamp of the recognized speech with the time stamp of the caption data to correct the time synchronization of the caption subtitle,
Wherein the time synchronization correction unit performs section correction based on a portion matched in a speech dictionary within a predetermined margin around the time stamp of the recognized speech.
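
For illustration only, the margin-limited time synchronization correction recited above can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Caption`, `RecognizedWord`, `correct_timestamps`, and the 5-second default margin are hypothetical, and simple matching on the first word of each caption line stands in for the speech-dictionary matching.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Caption:
    start: float  # seconds; time stamp assigned when the stenographed caption arrived
    text: str


@dataclass
class RecognizedWord:
    time: float  # seconds; time stamp produced by speech recognition
    word: str


def correct_timestamps(captions: List[Caption],
                       recognized: List[RecognizedWord],
                       margin: float = 5.0) -> List[Caption]:
    """Re-stamp each caption line using recognized speech within +/- margin.

    Assumption: a caption line is matched by its first word only. A line with
    no matching recognized word inside the margin keeps its original stamp.
    """
    corrected = []
    for cap in captions:
        words = cap.text.split()
        first = words[0].lower() if words else None
        candidates = [
            r for r in recognized
            if first is not None
            and r.word.lower() == first
            and abs(r.time - cap.start) <= margin
        ]
        if candidates:
            # Choose the recognized occurrence closest to the original stamp.
            best = min(candidates, key=lambda r: abs(r.time - cap.start))
            corrected.append(Caption(best.time, cap.text))
        else:
            corrected.append(cap)
    return corrected
```

Restricting the search to a margin keeps a frequently repeated word elsewhere in the program from pulling a caption line to an unrelated position.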

The apparatus according to claim 1,
Wherein the caption file includes the caption text of the corresponding program and a time stamp indicating the time position at which each caption text appears, and
The caption data includes at least one of a title of the broadcast program, a broadcast time of the broadcast program, a brief description of the broadcast program, a unique identifier of the broadcast program, an episode number of the broadcast program, an edited version of the caption subtitle, and a language code,
Caption subtitle correction apparatus.

delete

A receiving unit for receiving a caption subtitle generated through stenography corresponding to the speech of a broadcast, and programming schedule information of the broadcast;
A caption data generating unit for searching, among the broadcast programs included in the programming schedule information, for the broadcast program corresponding to the caption subtitle based on the reception time of the caption subtitle, and for generating a caption file and caption data by matching information of the found broadcast program with the caption subtitle; and
A correction unit for correcting a text format and a time synchronization value of each caption subtitle of the caption file based on the purpose of use of the caption file and a speech recognition result of the broadcast,
Wherein the correction unit comprises:
A format changing unit for changing the format of the caption subtitle to a format set according to the purpose of use of the caption subtitle;
A speech recognition unit for recognizing speech in a video of the broadcast program;
A time synchronization correction unit for correcting time synchronization by replacing the time stamp assigned to the first word of each line of the caption subtitle with a new time stamp generated by the speech recognition unit;
An error correction unit for checking and correcting spelling errors of the caption subtitle and time synchronization correction errors; and
A text coloring unit for coloring the text of the caption subtitle so that it is distinguished by speaker of the broadcast program,
Caption subtitle correction apparatus.

delete

A receiving unit for receiving a caption subtitle generated through stenography corresponding to the speech of a broadcast, and programming schedule information of the broadcast;
A caption data generating unit for searching, among the broadcast programs included in the programming schedule information, for the broadcast program corresponding to the caption subtitle based on the reception time of the caption subtitle, and for generating a caption file and caption data by matching information of the found broadcast program with the caption subtitle;
A correction unit for correcting a text format and a time synchronization value of each caption subtitle of the caption file based on the purpose of use of the caption file and a speech recognition result of the broadcast; and
A folder monitoring unit for monitoring, for the purpose of correcting the time synchronization of the caption subtitles included in the caption data, whether the name of a video file of the broadcast program in a folder is the same as the name of the caption data,
Caption subtitle correction apparatus.
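
For illustration only, the name comparison performed by the folder monitoring unit can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `find_matching_pairs` and the file extensions are hypothetical, and a real monitor would call such a check periodically or in response to file system events.

```python
from pathlib import Path
from typing import List, Tuple


def find_matching_pairs(folder,
                        video_exts=(".mp4", ".mxf"),
                        caption_exts=(".smi", ".srt")) -> List[Tuple[Path, Path]]:
    """Return (video, caption) pairs whose file names match, ignoring extension.

    Assumption: "the same name" means an identical file stem; the extension
    lists are illustrative, not taken from the patent.
    """
    videos, captions = {}, {}
    for path in Path(folder).iterdir():
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        if ext in video_exts:
            videos[path.stem] = path
        elif ext in caption_exts:
            captions[path.stem] = path
    # Only stems present on both sides are ready for time synchronization.
    shared = sorted(videos.keys() & captions.keys())
    return [(videos[stem], captions[stem]) for stem in shared]
```

Each returned pair is a candidate for the time synchronization correction, since both the video and its caption data are available.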

A receiving unit for receiving a caption subtitle generated through stenography corresponding to the speech of a broadcast, and programming schedule information of the broadcast;
A caption data generating unit for searching, among the broadcast programs included in the programming schedule information, for the broadcast program corresponding to the caption subtitle based on the reception time of the caption subtitle, and for generating a caption file and caption data by matching information of the found broadcast program with the caption subtitle;
A correction unit for correcting a text format and a time synchronization value of each caption subtitle of the caption file based on the purpose of use of the caption file and a speech recognition result of the broadcast; and
A schedule monitoring unit for monitoring the programming schedule at a predetermined period, based on the programming schedule information of the broadcast, for the purpose of correcting the time synchronization of the caption subtitles,
Caption subtitle correction apparatus.

A receiving unit for receiving a caption subtitle generated through stenography corresponding to the speech of a broadcast, and programming schedule information of the broadcast;
A caption data generating unit for searching, among the broadcast programs included in the programming schedule information, for the broadcast program corresponding to the caption subtitle based on the reception time of the caption subtitle, and for generating a caption file and caption data by matching information of the found broadcast program with the caption subtitle;
A correction unit for correcting a text format and a time synchronization value of each caption subtitle of the caption file based on the purpose of use of the caption file and a speech recognition result of the broadcast; and
A filtering unit for filtering a specific program out of the programming schedule information so that the correction unit does not operate on the specific program,
Caption subtitle correction apparatus.

6. The method of claim 5,
Further comprising a manual correction unit that, according to a user's need, corrects the time synchronization of a caption subtitle, among the caption subtitles, based on time synchronization input by the user, and confirms the correction result of the caption subtitle,
Caption subtitle correction apparatus.

Receiving a closed caption subtitle generated through stenography corresponding to the speech of a broadcast, and programming schedule information of the broadcast;
Searching, among the broadcast programs included in the programming schedule information, for the broadcast program corresponding to the caption subtitle based on the reception time of the caption subtitle;
Generating a caption file and caption data by matching information of the found broadcast program with the caption subtitle; and
Correcting the text format and the time synchronization value of each caption subtitle of the caption file based on a speech recognition result of the broadcast,
Wherein the step of correcting the text format and the time synchronization value comprises:
Recognizing speech in a video of the broadcast program; and
Comparing the time stamp of the recognized speech with the time stamp of the caption data to correct the time synchronization of the caption subtitle,
Wherein the step of correcting the time synchronization of the caption subtitle comprises performing section correction based on a portion matched in a speech dictionary within a predetermined margin around the time stamp of the recognized speech.
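
For illustration only, the searching step of the method above (finding the broadcast program that corresponds to a caption from its reception time) can be sketched as follows. This is a minimal sketch under stated assumptions: `ScheduleEntry`, its fields, and `find_program` are hypothetical names, and the schedule is assumed to be a list of non-overlapping time slots.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class ScheduleEntry:
    program_id: str  # hypothetical unique identifier of the broadcast program
    title: str
    start: datetime
    end: datetime


def find_program(schedule: List[ScheduleEntry],
                 received_at: datetime) -> Optional[ScheduleEntry]:
    """Return the schedule entry whose time slot contains the reception time.

    Slots are treated as half-open intervals [start, end), so a caption that
    arrives exactly at a boundary belongs to the program that is starting.
    """
    for entry in schedule:
        if entry.start <= received_at < entry.end:
            return entry
    return None
```

The title, identifier, and broadcast time of the returned entry could then be attached to the caption when the caption data is generated.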

delete

12. The method of claim 11,
Wherein the step of correcting comprises:
Changing the format of the caption subtitle to a format set according to the purpose of use of the caption subtitle;
Recognizing speech in a video of the broadcast program; and
Correcting the time synchronization by replacing the time stamp assigned to the first word of each line of the caption subtitle with a new time stamp generated through speech recognition,
Caption subtitle correction method.