KR20040034338A

KR20040034338A - Sync signal insertion/detection method and apparatus for synchronization between audio contents and text

Info

Publication number: KR20040034338A
Application number: KR1020030024306A
Authority: KR
Inventors: 신승원; 이원하; 김남훈
Original assignee: (주)마크텍; (주)디지탈플로우
Priority date: 2003-03-15
Filing date: 2003-04-17
Publication date: 2004-04-28
Also published as: KR20050117607A; KR100577558B1

Abstract

PURPOSE: A synchronization signal insertion and detection method and system are provided that synchronize audio content with text by inserting a synchronization signal into an audio file while minimizing the effect of text synchronization on audio quality. CONSTITUTION: An MP3 audio file to be played is selected and divided into frames (S301). Each frame is analyzed (S303). If space for inserting a synchronization signal is needed (S305), a stuffing space is created (S307): one byte is newly allocated for the stuffing space, and the position addresses of all subsequent frames are increased by one byte (S309). It is then determined whether a synchronization signal must be inserted into the frame (S311). If so, the synchronization signal is inserted into the stuffing space (S313).

Description

SYNC SIGNAL INSERTION/DETECTION METHOD AND APPARATUS FOR SYNCHRONIZATION BETWEEN AUDIO CONTENTS AND TEXT

The present invention relates to a method and apparatus for synchronizing digital audio content with its corresponding text in a portable digital playback device.

In recent years, with advances in computer technology, techniques for playing audio content on computers have developed rapidly. Accordingly, functions that visually display what the audio content says while it plays have attracted attention. One example is displaying a song's lyrics on screen while the song's audio content is played.

Referring to FIG. 10, a prior-art configuration that displays the content of audio content while it is played will be described.

First, the audio content to be played and a text file storing the content of that audio are prepared. FIG. 10 shows a conventional text file of this kind, rearranged as a table. In FIG. 10, the text file stores not only the content of the audio content but also the playback times at which that content is to be displayed. In the example of FIG. 10, the playback times that indicate when to output text during playback of a compressed voice or music file are stored in units of 1/1000 second.

For example, at playback time 0000040 ms, the audio content is played and the corresponding string "This invention, in a portable digital playback device," is output visually on a display. As playback proceeds, at playback time 0001055 ms the string "while playing a music or voice file" is output together with the audio.

That is, the playback time is monitored while the audio content plays, and an output string is displayed when the playback time matches the playback time listed for that string in the table.
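The table-driven prior art described above can be sketched as follows. This is an illustration only: the names `CUES` and `text_at` are hypothetical, and the table holds just the two entries quoted in the text. The point it illustrates is that the player has to keep comparing a millisecond clock against the table for as long as playback runs.

```python
from bisect import bisect_right

# Hypothetical cue table in the style of FIG. 10: (playback time in ms, text).
CUES = [
    (40, "This invention, in a portable digital playback device,"),
    (1055, "while playing a music or voice file"),
]

def text_at(cues, now_ms):
    """Return the most recent cue text whose time is <= now_ms, or None."""
    times = [t for t, _ in cues]
    i = bisect_right(times, now_ms)  # number of cues already due
    return cues[i - 1][1] if i else None
```

A player built this way would call `text_at` on every clock tick, which is the continuous monitoring the following paragraphs identify as a burden on a portable device.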

The structure of this text file is substantially similar to that of, for example, a ".smi file" used to display subtitles on video, and it suits environments such as a computer where ample resources are available.

However, when digital audio content and its corresponding text are synchronized in this way on a portable digital playback device, the available resources are limited. It is therefore not practical for such a device to monitor the playback time of the audio content in milliseconds and output text at those precise times. For that reason, the above method, which stores playback times and text as a table in a text file and outputs text based on the table, is not suitable for portable digital playback devices.

In addition, because the conventional method outputs text information to the LCD screen arbitrarily according to the elapsed playback time, the content actually being played may not match the content shown on the display.

Next, consider inserting the synchronization signal into digital audio content as a watermark, for example through a frequency transform. In general, watermarking refers to storing information about a work in the sound source, in a form that listeners cannot perceive, for purposes such as copyright protection and detecting forgery or alteration of the work. Because watermarking hides user-defined information in the actual sound source of the work, it is common to use a robust watermark, which withstands signal-processing attacks, compression, and similar transformations and is hard to remove maliciously.

Because such watermarking embeds data in the sound source of the digital content, recovering the hidden information requires a fairly complex computation and therefore considerable memory and processing. Implementing watermarking on a typical DSP consumes substantial resources, which makes it hard to use in portable digital playback devices such as DSP-based portable MP3 players. Moreover, an additional function that consumes many resources is undesirable given the limited battery life of a portable playback device. In particular, because most audio data is stored in a compressed format, conventional watermarking techniques cannot be used.

A technique for hiding information in compressed data is disclosed in MP3Stego, proposed by F. Petitcolas (Computer Laboratory, Cambridge, August 1998). Because this technique hides the data during compression of the sound source, it does not allow high-speed insertion.

The method of Non-Invertible Watermarking Methods For MPEG Encoded Audio, proposed by L. Qiao and K. Nahrstedt (Security and Watermarking of Multimedia Contents, January 1999), carries a high risk of degrading the MP3 sound source and limits the amount of information that can be hidden.

Likewise, A compressed-domain watermarking algorithm for MPEG Audio Layer 3, proposed by D. K. Koukopoulos and Y. C. Stamatiou (ACM Multimedia 2001, September 30 - October 5, Ottawa, Ontario, Canada), may allow high-speed extraction but does not allow high-speed insertion.

The present invention was devised to solve the problems described above. Its object is to provide a synchronization signal insertion method that inserts a synchronization signal into an audio file so that audio content and text can be synchronized, minimizing the effect of text synchronization on sound quality and allowing high-speed insertion and processing while matching the playback time of the audio content to the text output time.

Another object of the present invention is to provide a method that keeps the audio content playback device from consuming excessive resources when it plays audio content and outputs the text synchronized with it.

A further object of the present invention is to provide a synchronization signal detection method and apparatus that detect a synchronization signal from an audio file into which the synchronization signal has been inserted.

FIG. 1 is a conceptual diagram of the overall process for synchronizing an audio file with its corresponding text in a portable digital playback device.

FIG. 2 shows the structure of an MP3 frame.

FIG. 3 is a flowchart of the synchronization signal insertion process according to the first embodiment of the present invention.

FIG. 4 is a flowchart of the synchronization signal insertion process according to the second embodiment of the present invention.

FIG. 5 is a schematic diagram showing, frame by frame, an audio file into which synchronization signals have been inserted according to the second embodiment of the present invention.

FIG. 6 is a conceptual diagram of the process of synchronizing text with a voice file generated by TTS technology.

FIG. 7 is a schematic diagram outlining the synchronization signal detection process according to the present invention.

FIG. 8 is an internal block diagram for the case where the synchronization signal detection apparatus for text synchronization according to the present invention is implemented on the DSP of a portable digital playback device.

FIG. 9 is an internal block diagram for the case of implementation on the DSP of a portable digital playback device.

FIG. 10 shows a conventional text file that stores the content of audio content, rearranged as a table.

* Explanation of reference numerals for the main parts of the drawings *

101: text
103: audio file
105: text synchronization device
107: manager program
109: portable storage device
201: header
203: side information
205: main data
207: stuffing space

To achieve the above objects, the present invention provides a method of inserting a synchronization signal into an audio file comprising a plurality of frames, each frame having a first part in which audio content is stored, a second part containing at least information on the size of the first part, and a third part that is the portion other than the first and second parts. The method comprises: obtaining, from the second part of a frame, information on the size of the first part of the frame; determining, based on the obtained information, the start position and size of the third part of the frame; and inserting at least part of a synchronization signal into the third part of the frame.

Here, the first part includes header information of the audio file, the second part includes the audio content, and the third part is a portion that is not used in playing the audio content of the audio file. The third part includes a region indicating whether a synchronization signal is present and a region indicating the content of the synchronization signal.

The synchronization signal may include information on the position of the text corresponding to the first part of the frame. The step of inserting at least part of the synchronization signal into the third part of the frame may include: determining whether to insert a synchronization signal into the third part of the frame; and, in response to a determination not to insert one, inserting text information corresponding to the first part of the frame into the third part of the frame.

Preferably, the step of inserting at least part of the synchronization signal into the third part of the frame compares the synchronization signal insertion space in the third part with the size of the synchronization signal and, if the insertion space is smaller than the synchronization signal, inserts into the third part a portion of the synchronization signal equal in size to the insertion space.

The audio content may also be generated by text-to-speech (TTS) conversion of the text.

The present invention also provides a method of detecting a synchronization signal from an audio file comprising a plurality of frames, each frame having a first part in which audio content is stored, a second part containing at least information on the size of the first part, and a third part that is the portion other than the first and second parts. The method comprises: extracting, based on the information on the size of the first part, information on the start position and size of the third part; analyzing the third part to determine whether a synchronization signal is present; and, in response to a determination that a synchronization signal is present, obtaining at least part of the synchronization signal from the third part.

Here, the first part includes header information of the audio file, the second part includes the audio content, and the third part is a portion that is not used in playing the audio content of the audio file. The third part includes a region indicating whether a synchronization signal is present and a region indicating the content of the synchronization signal.

The method may further include extracting text information from the third part in response to a determination that no synchronization signal is present. It may also further include analyzing the content of the synchronization signal and selecting the position of the corresponding text based on that analysis.

Preferably, when the at least part of the synchronization signal obtained from the third part is not identical to the synchronization signal, that is, when only a fragment has been obtained, the method further includes combining that fragment with at least part of the synchronization signal from a subsequent frame.
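The combining step can be sketched as below. This is a minimal illustration under assumed conventions that the source does not specify: each frame's third part (stuffing space) is modeled as a string of '0'/'1' characters, its first bit is taken as the presence flag, the remaining bits carry a fragment of the signal, and the total signal length `signal_len` is fixed and known to the detector. The name `collect_sync` is hypothetical.

```python
def collect_sync(stuffing_spaces, signal_len):
    """Reassemble one sync signal of signal_len bits from per-frame stuffing bits.

    Returns (frame_index, signal_bits) for the frame that completes the
    signal, or None if the stream ends before the signal is complete.
    """
    collected = ""
    for i, bits in enumerate(stuffing_spaces):
        if not bits or bits[0] != "1":   # presence flag clear: no fragment here
            continue
        collected += bits[1:]            # append this frame's fragment
        if len(collected) >= signal_len:
            return i, collected[:signal_len]  # drop any trailing padding
    return None
```

For instance, `collect_sync(["1101", "", "0", "110011", "000"], 8)` gathers three bits from frame 0 and five from frame 3, so the signal completes at frame 3. Only a flag test and a concatenation are needed per frame, which fits the low-resource goal stated earlier.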

The present invention further provides, as an apparatus for detecting a synchronization signal from an audio file comprising a plurality of frames, each frame having a first part in which audio content is stored, a second part containing at least information on the size of the first part, and a third part that is the portion other than the first and second parts, a synchronization signal insertion apparatus comprising: a synchronization signal presence determination unit that extracts, based on the information on the size of the first part, information on the start position and size of the third part and analyzes the third part to determine whether a synchronization signal is present; and a synchronization signal acquisition unit that, in response to a determination that a synchronization signal is present, obtains at least part of the synchronization signal from the third part.

Hereinafter, preferred embodiments of the present invention are described in more detail with reference to the accompanying drawings.

Referring to FIG. 1, an audio file (103) and its corresponding text (101) are first input to the text synchronization device (105). Using this input, the text synchronization device (105) receives directly from the user the times at which each line of lyrics should be output. The information received from the user may consist of entries that link each text to be output with its playback time. Following the synchronization signal insertion method of the present invention, the text synchronization device (105) then inserts, at predetermined positions in the audio file (103), information indicating the position of the text to be output there. The manager program (107) receives the synchronized MP3 file and the text from the text synchronization device (105) and downloads them to the portable playback device (109).

Thereafter, when the portable playback device (109) plays the audio file (103) and a synchronization signal is detected during playback, the device analyzes the synchronization signal, finds the text position it indicates, and outputs the corresponding string through the display means of the portable playback device (109).

The embodiments below are described using MP3 as the music file format, but it will be apparent to those skilled in the art that the synchronization signal insertion method of the present invention can also be applied or adapted to music files stored in other audio file formats such as WMA, AAC, and AC3.

FIG. 2 shows the structure of an MP3 frame. Referring to FIG. 2, an MP3 audio file consists of a sequence of frames, and each frame consists of a header (201) that carries 12 sync bits, side information (203), main data (205), and a stuffing space (207).

The header (201) and side information (203) store general information about the frame's organization, including the sync. The main data (205) stores the audio content, losslessly compressed by Huffman coding. The losslessly compressed main data (205) is stored in units of bytes, and Huffman coding leaves surplus bits that carry none of the audio content. These surplus bits are called stuffing bits, and the region they occupy is called the stuffing space. In other words, these bits are empty space that is never used when the music is played. Because the stuffing space (207) consists of bits that pad the frame, including the main data (205), to a whole number of bytes, its size is determined by the size of the main data (205) produced by Huffman coding the audio content.

As described in more detail below, the present invention uses this structural property of the frame to insert a synchronization signal into the stuffing space.

FIG. 3 is a flowchart of the synchronization signal insertion process according to the first embodiment of the present invention. Referring to FIG. 3, once the MP3 audio file to be played is selected, it is first divided into frames (S301).

Each frame is then analyzed (S303). Frame analysis examines the header (201) and side information (203) to obtain the start position and size of the main data (205). The size and position of the stuffing space (207) are then derived from the size of the main data (205).
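The frame analysis of step S303 can be sketched as simple arithmetic on the frame layout of FIG. 2. This is a simplified model, not a real MP3 parser: `FrameLayout` and `stuffing_space` are hypothetical names, the field sizes are assumed to have been decoded already from the header and side information, and the main data is assumed to follow the side information directly within the same frame (so the Layer III bit reservoir is ignored).

```python
from dataclasses import dataclass

@dataclass
class FrameLayout:
    header_bits: int      # header (201)
    side_info_bits: int   # side information (203)
    main_data_bits: int   # main data (205), size taken from the side information
    frame_bytes: int      # total frame length in bytes

def stuffing_space(f):
    """Return (start_bit, size_bits) of the stuffing space (207) in frame f."""
    start = f.header_bits + f.side_info_bits + f.main_data_bits
    size = f.frame_bytes * 8 - start      # whatever the main data leaves over
    if size < 0:
        raise ValueError("main data exceeds frame length")
    return start, size
```

With illustrative figures, `stuffing_space(FrameLayout(32, 256, 3035, 417))` gives a start offset of 3323 bits and 13 stuffing bits. A size of 0 corresponds to the case where no stuffing space exists.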

Depending on the size of the main data (205), it may be determined that no stuffing space (207) exists. Even then, if space for inserting a synchronization signal is judged necessary (S305), space for a stuffing space (207) may be created (S307). In that case one byte is newly allocated for the stuffing space, and the file is restructured so that every subsequent frame is pushed back by one byte (S309).
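The effect of steps S307 and S309 on frame positions can be sketched as follows. This is a toy model with hypothetical names (`frame_offsets`, `add_stuffing_byte`): frames are plain `bytes` objects, and the new byte is simply appended as a zero. A real MP3 writer would also have to keep the frame-length information in the header consistent, which is omitted here.

```python
def frame_offsets(frames):
    """Byte offset of each frame within the concatenated file."""
    offsets, pos = [], 0
    for frame in frames:
        offsets.append(pos)
        pos += len(frame)
    return offsets

def add_stuffing_byte(frames, index):
    """Return a copy of frames with one zero byte appended to frames[index]."""
    out = list(frames)
    out[index] = out[index] + b"\x00"   # newly allocated stuffing byte (S307)
    return out
```

With three 4-byte frames the offsets are `[0, 4, 8]`; after `add_stuffing_byte(frames, 0)` they become `[0, 5, 9]`, so every later frame has moved back by one byte as in S309.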

It is then determined whether a synchronization signal should be inserted into the frame (S311). This may be decided from information entered by the user in advance. For example, while playing the audio file, the user can enter directly, through an input device of the text synchronization device, which part of the text should be output at which point. It may also be decided automatically, as in the TTS approach described later. If a synchronization signal is to be inserted, it is inserted into the stuffing space (S313). Because a synchronization signal is generally larger than the number of bits in a stuffing space, a single stuffing space receives not the whole signal but at least part of it; one synchronization signal may be spread across several stuffing spaces. In an exemplary embodiment, the stuffing space includes a part indicating the presence of a synchronization signal and a part carrying the signal's content, namely the position of the text and the number of characters to output. How many bits of the synchronization signal go into a given frame depends on how many bits its stuffing space provides.
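Spreading one synchronization signal over several stuffing spaces can be sketched as below. The bit layout is an assumption made for illustration, since the source does not fix one: the first bit of each used stuffing space is a presence flag set to '1', the remaining bits carry the next fragment of the signal, and spaces that are unused or too small to hold a flag plus data are filled with '0'. The name `split_sync` is hypothetical.

```python
def split_sync(signal_bits, capacities):
    """Distribute signal_bits over stuffing spaces with the given bit capacities.

    Returns one bit string per frame, each exactly as long as that frame's
    stuffing space.
    """
    chunks, pos = [], 0
    for cap in capacities:
        room = cap - 1                       # one bit is reserved for the flag
        if pos < len(signal_bits) and room > 0:
            part = signal_bits[pos:pos + room]
            pos += len(part)
            chunks.append(("1" + part).ljust(cap, "0"))  # pad the last chunk
        else:
            chunks.append("0" * cap)         # nothing inserted in this frame
    if pos < len(signal_bits):
        raise ValueError("not enough stuffing space for the sync signal")
    return chunks
```

For an 8-bit signal `"10110011"` and stuffing capacities `[4, 0, 1, 6, 3]`, the first frame takes three signal bits, the frames with 0 and 1 bits are skipped, and the fourth frame takes the remaining five. Because the last chunk may be zero-padded, a detector under this layout needs to know the signal length in advance.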

By repeating the above process for each frame, the synchronization signal is inserted into the audio file made up of those frames.

Thus, by inserting into the audio file a synchronization signal that allows the audio content and text to be synchronized, the above configuration keeps the audio content playback device from consuming excessive resources when it plays the audio content and outputs the synchronized text.

Next, the second embodiment of the present invention is described with reference to FIGS. 4 and 5. FIG. 4 is a flowchart of the synchronization signal insertion process according to the second embodiment.

Although not shown in FIG. 4, steps S301 to S309 of FIG. 3 are performed in the same way before step S411 of FIG. 4; they are omitted for convenience of illustration and description.

First, it is determined whether a synchronization signal needs to be inserted (S411).

If no synchronization signal needs to be inserted, text is inserted into the stuffing space (S415). Because a text string is generally longer than the number of bits in a stuffing space, a single stuffing space receives not the whole string but at least part of it. That is, one text string is spread across several stuffing spaces.

FIG. 5 is a schematic diagram showing, frame by frame, an audio file into which synchronization signals have been inserted according to the second embodiment of the present invention. In FIG. 5, the audio file is shown schematically, divided into frames. Frames in the text-insertion region carry text information, and the frame at the text output time carries a synchronization signal. Some frames in the text-insertion region may carry nothing in their stuffing space; these form the standby region described later in this paragraph. So that the playback time of the frame containing the synchronization signal becomes the moment at which the text inserted in the preceding frames is output, the text information to be output is first inserted into one or more frames. Once all the text information has been inserted, the process waits until the synchronization signal is to be inserted. In this standby state no further information is inserted, and all stuffing bits in each frame are initialized to '0'. Then, when the position of the current frame matches the time at which the text should be output, the synchronization signal is inserted.

Returning to FIG. 4, if a synchronization signal is to be inserted, it is inserted into the stuffing space (S413). As described above with reference to FIG. 3, the synchronization signal is generally larger than the number of bits in a stuffing space, so although an entire synchronization signal may be inserted into one stuffing space, it is also possible to insert only a portion of the synchronization signal into one stuffing space. That is, one synchronization signal may be spread across a plurality of stuffing spaces. In this embodiment it suffices for the synchronization signal inserted into the stuffing space to contain only the part that indicates the presence of a synchronization signal. This is because, during playback of the audio file, the information stored in the stuffing spaces of the frames preceding the frame in which the synchronization signal is detected consists of fragments of the text, so collecting those fragments yields the text to be output to the display when the synchronization signal is detected.

By repeating the above process for each frame, the synchronization signals and the text corresponding to the audio content are inserted into the audio file composed of those frames.
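The per-frame loop of the second embodiment (text fragments first, then zeroed standby frames, then the synchronization signal) might look like the sketch below. The one-byte `SYNC_MARKER`, the fixed per-frame `capacity`, and the `cues` structure are hypothetical; the source does not specify how the synchronization signal is encoded. The byte 0xFF is chosen here only because it never occurs in UTF-8 text.

```python
SYNC_MARKER = b"\xff"  # hypothetical marker; 0xFF never appears in UTF-8 text


def build_stuffing_payloads(num_frames, capacity, cues, encoding="utf-8"):
    """Return the stuffing payload for every frame.

    cues: list of (sync_frame_index, text), sorted by frame index.
    Each text is written into the frames before its sync frame; frames left
    over stay all-zero (standby), and the sync frame carries the marker.
    """
    payloads = [bytes(capacity) for _ in range(num_frames)]  # standby: all zeros
    next_free = 0
    for sync_frame, text in cues:
        data = text.encode(encoding)
        frame, offset = next_free, 0
        while offset < len(data):
            if frame >= sync_frame:
                raise ValueError("text does not fit before its sync frame")
            chunk = data[offset:offset + capacity]
            payloads[frame] = chunk.ljust(capacity, b"\x00")
            offset += len(chunk)
            frame += 1
        payloads[sync_frame] = SYNC_MARKER.ljust(capacity, b"\x00")
        next_free = sync_frame + 1
    return payloads


payloads = build_stuffing_payloads(6, 2, [(3, "hi!"), (5, "ok")])
```

In this example frame 2 stays zeroed because the text "hi!" is fully written by frame 1 and its output time does not arrive until frame 3.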

Meanwhile, the audio file that is synchronized with text according to the present invention may be one generated using a text-to-speech (TTS) engine. FIG. 6 is a conceptual diagram illustrating the process of synchronizing text with a voice file generated by TTS technology.

TTS is a technology that synthesizes speech from text to produce a voice file. To convert text into an audio file, the TTS engine (603) builds a phoneme database in the minimum pronunciation units of each language, then retrieves phonemes from that database in view of the context before and after each character and synthesizes them to generate the voice signal. In the configuration of the present invention described above with reference to FIG. 1, the position of the text to be synchronized with the audio file must be entered directly by the user. With TTS speech synthesis, however, the position in the text file that corresponds to each part of the voice file is known automatically as the voice file is generated, so no separate user input is needed.
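As a rough illustration of why TTS removes the manual step: if the engine reports how long each synthesized text segment lasts, the frame at which each segment begins follows by arithmetic. The sketch assumes per-segment durations are available (a hypothetical engine output, not something the source specifies) and uses the 1152 samples per frame of MPEG-1 Layer III.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III frame length in samples


def cue_frames(segment_durations, sample_rate_hz):
    """Map each text segment to the index of the frame where its audio begins.

    segment_durations: seconds of synthesized audio per segment
    (assumed to be reported by the TTS engine).
    """
    frames = []
    elapsed_samples = 0
    for duration in segment_durations:
        frames.append(elapsed_samples // SAMPLES_PER_FRAME)
        elapsed_samples += round(duration * sample_rate_hz)
    return frames


frames = cue_frames([1.0, 2.0, 0.5], 44100)
```

The resulting indices are the frames whose playback time matches the start of each segment, that is, candidate positions for the synchronization signals.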

The synchronization signal detection process according to the present invention is described below.

The MP3 audio file is stored in memory. In response to a playback command for the MP3 audio file, the contents of the file are read from the memory (S701). The data read is supplied as an MP3 stream for frame analysis.

Next, the audio file delivered as an MP3 stream is divided into frames (S703). Each frame can be located from the position of the preceding frame and from the header and side information of each frame.

Next, the size of the audio content is extracted for each frame using the header and side information. The bit size and position of the stuffing space follow from the size of the audio content, so the information about the stuffing space is identified at this point (S705). Information on whether a stuffing space exists and, if it does, on its position and size is then supplied for constructing the synchronization signal and text.
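A simplified sketch of this calculation is given below. It treats the stuffing space as whatever remains in the frame after the header, the side information, and the audio data, and it ignores the MP3 bit reservoir, which a real parser would have to account for. The function names and example sizes are assumptions; the frame-length formula is the standard one for MPEG-1 Layer III.

```python
def mpeg1_layer3_frame_len(bitrate_bps, sample_rate_hz, padding):
    """Frame length in bytes for MPEG-1 Layer III (padding is 0 or 1)."""
    return 144 * bitrate_bps // sample_rate_hz + padding


def locate_stuffing(frame_len, header_len, side_info_len, audio_bits):
    """Return (offset, size) in bytes of the unused tail of a frame, or None.

    Simplification: audio data is assumed to sit entirely inside its own
    frame, directly after the side information.
    """
    audio_bytes = (audio_bits + 7) // 8  # round up to whole bytes
    start = header_len + side_info_len + audio_bytes
    size = frame_len - start
    if size < 0:
        raise ValueError("audio data exceeds the frame length")
    return (start, size) if size > 0 else None


frame_len = mpeg1_layer3_frame_len(128000, 44100, 0)
stuffing = locate_stuffing(frame_len, header_len=4, side_info_len=32, audio_bits=3000)
```

Here a 128 kbit/s, 44.1 kHz frame without padding is 417 bytes long, and 3000 bits of audio leave a 6-byte tail starting at byte offset 411.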

Next, the content of the detected synchronization signal is analyzed, and the synchronization signal and text are constructed (S707). In the first embodiment, the position in the text file indicated by the synchronization signal and the length of the character string to be displayed are determined, and that portion of the string is read from the text file. In the second embodiment, in which the text is contained in the MP3 audio file itself, the bit content of the stuffing space is read whenever no synchronization signal is present and is appended to a separate memory area; when a synchronization signal is detected, the content accumulated in that memory area is output as text. After being output as text, the content is removed from the memory area. The resulting character string is then supplied for output to the LCD.
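For the second embodiment, the accumulate-then-flush behaviour can be sketched as follows. As before, the one-byte `SYNC_MARKER` and the zero padding of unused stuffing bytes are assumptions made for illustration; the source does not define the actual encoding.

```python
SYNC_MARKER = b"\xff"  # hypothetical marker; 0xFF never appears in UTF-8 text


def extract_cues(payloads, encoding="utf-8"):
    """Rebuild (frame_index, text) pairs from per-frame stuffing payloads.

    Payloads without the marker are appended to a buffer; a payload that
    starts with the marker triggers output of the buffer, which is then cleared.
    """
    buffer = bytearray()
    cues = []
    for index, payload in enumerate(payloads):
        if payload.startswith(SYNC_MARKER):
            cues.append((index, buffer.decode(encoding)))
            buffer.clear()  # text already output is removed from memory
        else:
            buffer.extend(payload.rstrip(b"\x00"))  # zeroed standby frames add nothing
    return cues


cues = extract_cues([b"hi", b"!\x00", b"\x00\x00", b"\xff\x00", b"ok", b"\xff\x00"])
```

Each returned pair gives the frame at which the text should appear on the display together with the text assembled from the preceding frames.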

Next, an LCD controller (not shown) controls the LCD so that the string currently displayed is erased and the new string is displayed (S709). If the text to be output is longer than the string the LCD can show at once, the string can be scrolled automatically from right to left; this scrolling process is well known to those skilled in the art.
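One simple way to picture the right-to-left scroll is as a sequence of fixed-width windows over the string, each shifted one character further along. The function name and display width below are hypothetical.

```python
def scroll_windows(text, width):
    """Return the successive display contents for a right-to-left scroll.

    Text that fits is shown once, padded to the display width; longer text
    is shown as a window that advances one character per step.
    """
    if len(text) <= width:
        return [text.ljust(width)]
    return [text[i:i + width] for i in range(len(text) - width + 1)]


steps = scroll_windows("abcdef", 4)
```

Showing the windows in order makes the text appear to move leftward across the display.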

The synchronization signal detection apparatus of FIG. 7 may be implemented in a portable digital playback device as shown in FIGS. 8 and 9. Implementation in the DSP is typical; however, because the MICOM controls all external devices, it is advantageous to implement the text synchronization in the MICOM, as in FIG. 8, if the MICOM has sufficient resources to spare. Because the processing speed and memory required to implement synchronization by the proposed method are very small, processing in the MICOM is entirely feasible.

FIG. 8 is an internal block diagram for the case where the synchronization signal detection apparatus for text synchronization according to the present invention is implemented in the DSP of a portable digital playback device, and FIG. 9 is an internal block diagram for the case where it is implemented in the DSP of a portable digital playback device.

FIGS. 8 and 9 are based on the internal configuration of a typical playback device. When the user presses the play button, the MICOM obtains the name of the file to be played. It then reads the data of that file and passes it to a buffer, and the DSP decodes the compressed data in the buffer and plays the music through the speaker.

When the present invention, which displays lyrics or the spoken content of the file being played on the LCD, is added to this process, the overall structure changes as follows. The step in which the MICOM obtains the file to be played is the same. After the file is obtained, the data read from it is passed to the buffer, and the synchronization signal detector checks whether the transferred data contains a synchronization signal. When the detector finds a synchronization signal, it notifies the controller of the MICOM that a synchronization signal has been found and reports its content. The LCD controller of the MICOM then sends the information reported by the synchronization signal detector to the LCD screen.

FIGS. 8 and 9 differ only in where inside the device the synchronization signal detector is located. Whichever arrangement is chosen to suit the structural characteristics of the portable playback device, the overall execution procedure is the same.

The present invention has been described with reference to specific embodiments for specific applications. Those of ordinary skill in the art who have access to the present teachings will recognize additional modifications, applications, and embodiments within its scope.

Accordingly, the appended claims are intended to cover any and all such applications, modifications, and embodiments within the spirit of the invention.

By adding a text synchronization apparatus to a portable digital playback device, the present invention makes it possible to display the lyrics of the music, or the spoken content, automatically on the LCD while a music file or voice file is being played.

While a compressed file is being played, the present invention detects in real time the synchronization signals hidden in the music file and displays the text on the LCD screen in step with the current playback position of the content file. The user can therefore follow the content currently being played on the LCD screen of the playback device. In addition, because all of the information, from the text itself to the time at which it should be output, is hidden in the digital content, the user does not need to store a text file or other information separately.

In particular, because the present invention can be applied broadly, from the lyrics of ordinary music to the content of textbooks for foreign language study, it can be used very effectively in portable digital playback devices for language learning.

Claims

A method of inserting a synchronization signal into an audio file comprising a plurality of frames, each frame having a first portion in which audio content is stored, a second portion containing information about the size of the first portion, and a third portion other than the first portion and the second portion, the method comprising:

obtaining information about the size of the first portion of the frame from the second portion of the frame;

determining a starting position and size of the third portion of the frame based on the obtained information; and

inserting at least a portion of the synchronization signal into the third portion of the frame.

The method of claim 1,

The first portion includes header information of the audio file,

The second portion includes the audio content,

And the third portion is a portion not used for reproduction of audio contents of the audio file.

The method of claim 1,

wherein the third portion includes an area indicating whether a synchronization signal is present and an area indicating the content of the synchronization signal.

The method of claim 1,

wherein the synchronization signal includes information about a position of text corresponding to the first portion of the frame.

The method of claim 1,

wherein inserting at least a portion of the synchronization signal into the third portion of the frame comprises:

determining whether to insert a synchronization signal into the third portion of the frame; and

in response to a determination not to insert the synchronization signal, inserting text information corresponding to the first portion of the frame into the third portion of the frame.

The method according to any one of claims 1 to 5,

further comprising comparing the size of the synchronization signal insertion space in the third portion with the size of the synchronization signal and, when the insertion space is smaller than the synchronization signal, inserting into the third portion a part of the synchronization signal equal in size to the insertion space.

The method of claim 1,

wherein the audio content is generated by text-to-speech (TTS) conversion of the text.

A method of detecting a synchronization signal from an audio file comprising a plurality of frames, each frame having a first portion in which audio content is stored, a second portion containing information about the size of the first portion, and a third portion other than the first portion and the second portion, the method comprising:

extracting information about a start position and a size of the third portion based on the information about the size of the first portion;

analyzing the third portion to determine whether a synchronization signal is present; and

in response to determining that the synchronization signal is present, obtaining at least a portion of the synchronization signal from the third portion.

The method of claim 8,

The first portion includes header information of the audio file,

The second portion includes the audio content,

The method of claim 8,

further comprising, in response to determining the absence of a synchronization signal, extracting text information from the third portion.

The method of claim 8,

further comprising analyzing the content of the synchronization signal and then selecting a position of the corresponding text based on the analysis.

The method according to claim 8 to 12,

further comprising combining the obtained portion of the synchronization signal with at least a portion of the synchronization signal of a subsequent frame if the portion obtained from the third portion is not the complete synchronization signal.

An apparatus for detecting a synchronization signal from an audio file comprising a plurality of frames, each frame having a first portion in which audio content is stored, a second portion containing information about the size of the first portion, and a third portion other than the first portion and the second portion, the apparatus comprising:

a synchronization signal presence determining unit that extracts information about the start position and size of the third portion based on the information about the size of the first portion, and analyzes the third portion to determine whether a synchronization signal is present; and

a synchronization signal acquiring unit that acquires at least a portion of the synchronization signal from the third portion in response to the determination that the synchronization signal is present.