KR101379697B1 - Apparatus and methods for synchronized E-Book with audio data

Info

Publication number: KR101379697B1
Application number: KR1020120017351A
Authority: KR
Inventors: 박원연; 채승엽; 심재웅
Original assignee: (주)케이디엠티
Priority date: 2012-02-21
Filing date: 2012-02-21
Publication date: 2014-04-02
Also published as: KR20130095932A

Abstract

The e-book authoring apparatus according to the present invention comprises: a sentence extraction unit that receives voice data and document data, converts the voice data into sentence-level text by speech recognition and stores it, together with time information (time code) for the voice data, as a first sentence pool, and extracts text from the document data sentence by sentence and stores it as a second sentence pool; a sentence matching unit that determines, based on a preset similarity, the sentence of the first sentence pool (first sentence) that matches a sentence of the second sentence pool (second sentence); a synchronization unit that synchronizes the time code of the first sentence with the second sentence and stores the result; a memory unit that stores the first sentence pool, the second sentence pool, and the synchronization information; and a control unit that controls matching and synchronization through the sentence matching unit and the synchronization unit until either the first sentence pool or the second sentence pool has no sentence left to match.
According to the present invention, an e-book can be produced in which voice data and document data are synchronized sentence by sentence. Moreover, because the position information of each sentence in the document data is synchronized with the voice data, the synchronized voice data can be accessed immediately even when an arbitrary sentence is selected from the document data.

Description

Apparatus and methods for synchronized E-Book with audio data

The present invention relates to an apparatus and method for authoring an e-book synchronized with audio data (voice data). More specifically, it relates to an e-book authoring apparatus and method in which voice data having the same content as the text of the e-book is synchronized with that text.

An e-book (electronic book) is digitized content that can be viewed in the form of a book or magazine using digital equipment such as a computer, an e-book reader, or a smartphone.

Conventional e-books are generally used by displaying a text document on the display screen of an e-book device so that the user can read the text on that display.

There is also audiobook content, which is similar in kind to the e-book: content in which a text document is converted to speech by TTS (text to speech), or in which a voice actor's reading of the text is stored as a digital file.

In fields such as e-learning, beyond e-books that are simply text or simply audio, there is a need, for the sake of learning effectiveness, to synchronize a text-based e-book with voice data of the same content sentence by sentence, as shown in FIG. 3.

For example, a young child learning to read or a student studying a foreign language can learn efficiently by reading and listening to the same content in text form and voice form at the same time. In addition, if the audiobook passage corresponding to an arbitrary sentence in the middle of a text e-book could be accessed immediately, improved learning efficiency could be expected.

However, conventional e-books and audiobooks have developed separately from each other, and even where they have been combined, as in Korean Patent Laid-open Publication No. 10-2009-0060040 ("Interactive digital audiobook production system and its operation method"), the text and the video are synchronized only by chapter or page, not sentence by sentence.

Korean Patent Laid-open Publication No. 2009-0060040 (2009.06.11), Title of the invention: Interactive digital audiobook production system and its operation method

The technical problem to be solved by the present invention is to provide an e-book authoring apparatus and method capable of producing an e-book in which voice data and document data are synchronized sentence by sentence.

A further object is to provide an e-book authoring apparatus and method that, by synchronizing the position information of sentences in the document data with the voice data, allow the synchronized voice data to be accessed immediately even when an arbitrary sentence is selected from the document data.

An e-book authoring method of an e-book authoring apparatus for solving the problem of the present invention comprises: (a) receiving voice data and document data; (b) converting the voice data into sentence-level text by speech recognition and storing it, together with time information (time code) for the voice data, as a first sentence pool, and extracting text from the document data sentence by sentence and storing it as a second sentence pool; (c) determining, based on a preset similarity, the sentence of the first sentence pool (first sentence) that matches a sentence of the second sentence pool (second sentence); (d) synchronizing the time code of the first sentence with the second sentence and storing the result; and (e) repeating steps (c) and (d) until either the first sentence pool or the second sentence pool has no sentence left to match.

In the e-book authoring method, when text is extracted from the document data sentence by sentence in step (b), the position information of each extracted sentence within the document data may also be extracted and stored in the second sentence pool.

The e-book authoring method may further comprise, after step (c), a step (c-2) of adjusting the second sentence with reference to the first sentence when the first sentence and the second sentence differ in length.

In the e-book authoring method, the synchronization in step (d) may synchronize the time code with the sentence position information of the second sentence.

An e-book authoring apparatus for solving the problem of the present invention comprises: a sentence extraction unit that receives voice data and document data, converts the voice data into sentence-level text by speech recognition and stores it, together with time information (time code) for the voice data, as a first sentence pool, and extracts text from the document data sentence by sentence and stores it as a second sentence pool; a sentence matching unit that determines, based on a preset similarity, the sentence of the first sentence pool (first sentence) that matches a sentence of the second sentence pool (second sentence); a synchronization unit that synchronizes the time code of the first sentence with the second sentence and stores the result; a memory unit that stores the first sentence pool, the second sentence pool, and the synchronization information; and a control unit that controls matching and synchronization through the sentence matching unit and the synchronization unit until either the first sentence pool or the second sentence pool has no sentence left to match.

According to the present invention, an e-book in which voice data and document data are synchronized sentence by sentence can be produced.

In addition, according to the present invention, because the position information of sentences in the document data is synchronized with the voice data, the synchronized voice data can be accessed immediately even when an arbitrary sentence is selected from the document data.

FIG. 1 is a block diagram of the e-book authoring apparatus.
FIG. 2 is a flowchart of the e-book authoring method of the e-book authoring apparatus.
FIG. 3 is a diagram illustrating the concept of the present invention.

Details of the objects, technical configuration, and resulting effects of the present invention will be understood more clearly from the following detailed description based on the drawings attached to this specification. Embodiments of the present invention are described in detail with reference to the accompanying drawings.

The embodiments disclosed in this specification should not be interpreted or used as limiting the scope of the present invention. It is obvious to a person of ordinary skill in the art that the description, including the embodiments, has a variety of applications. Accordingly, any embodiments set out in the detailed description are examples intended to explain the invention better, and are not intended to limit the scope of the invention to those embodiments.

The functional blocks shown in the drawings and described below are only examples of possible implementations. Other implementations may use other functional blocks without departing from the spirit and scope of the detailed description. Also, although one or more functional blocks of the present invention are shown as individual blocks, one or more of them may be a combination of various hardware and software components that perform the same function.

FIG. 1 is a block diagram of the e-book authoring apparatus according to the present invention.

Referring to FIG. 1, the e-book authoring apparatus according to the present invention comprises: a sentence extraction unit 110 that receives voice data and document data, processes them, and stores the results in a first sentence pool and a second sentence pool, respectively; a sentence matching unit 120 that determines the sentence of the first sentence pool (first sentence) that matches a sentence of the second sentence pool (second sentence); a synchronization unit 130 that synchronizes the time code of the first sentence with the second sentence and stores the result; and a control unit 140 that controls matching and synchronization through the sentence matching unit 120 and the synchronization unit 130 until no sentence remains to be matched.

The sentence extraction unit 110 receives voice data and document data. It converts the voice data into sentence-level text by speech recognition and stores the text, together with time information (time code) for the voice data, as a first sentence pool; it also extracts text from the document data sentence by sentence and stores it as a second sentence pool.

So as not to obscure the gist of the present invention, this specification omits a description of the specific methods and prior art for converting voice data into text by speech recognition.

The voice data received by the sentence extraction unit 110 is voice data of a particular text read aloud by a person or a TTS (text to speech) device, and the document data is data in digital file form that contains that text.

When converting the voice data into text, the sentence extraction unit 110 performs the conversion sentence by sentence. That is, it converts the voice data into text divided into grammatically appropriate sentences.

In this case, the entire voice data may be converted into text first and then divided into sentences, or the text conversion and the sentence division may be performed at the same time.

The voice data may be converted into sentences through semantic and grammatical analysis of the text, or the sentence-level conversion may use the lengths of the silent gaps in the voice data together with semantic analysis of the sentences.

When the lengths of the silent gaps in the voice data are used, the following method may be applied. After noise-removal filtering and removal of signals below a certain threshold, the entire voice data is scanned to measure the lengths of the silent portions that contain no voice signal. The distribution of these gap lengths is examined to establish the length of the gap between words, and wherever the voice data contains a gap longer than that inter-word gap, the data on either side of it can be treated as separate sentences.
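
The gap-length method described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the invention: the function names are hypothetical, the input is assumed to be an already noise-filtered list of amplitude samples, and the inter-word gap length is estimated as the median gap length, with any gap more than `factor` times longer treated as a sentence boundary. The specification itself says only that the distribution of gap lengths is examined.

```python
from statistics import median


def find_gaps(samples, threshold):
    """Return (start, length) for each interior run of samples at or below threshold."""
    gaps = []
    start = None
    for i, s in enumerate(samples):
        if abs(s) <= threshold:
            if start is None:
                start = i          # a silent run begins here
        elif start is not None:
            gaps.append((start, i - start))
            start = None
    # A silent run that reaches the end of the data is not reported,
    # because no speech follows it.
    return [g for g in gaps if g[0] > 0]  # leading silence is ignored too


def split_on_long_gaps(samples, threshold, factor=2.0):
    """Split the samples into (start, end) segments at unusually long gaps.

    Assumption: the typical inter-word gap is the median gap length, and a gap
    longer than factor * median marks a sentence boundary.
    """
    gaps = find_gaps(samples, threshold)
    if not gaps:
        return [(0, len(samples))]
    word_gap = median(length for _, length in gaps)
    segments = []
    seg_start = 0
    for start, length in gaps:
        if length > factor * word_gap:
            segments.append((seg_start, start))
            seg_start = start + length
    segments.append((seg_start, len(samples)))
    return segments
```

The start index of each segment, divided by the sampling rate, would give the offset from which a time code for that sentence could be derived.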

In addition, the sentence division may be adjusted through grammatical and semantic analysis of the sentences, together with the gap lengths of the voice data.

The sentence extraction unit 110 stores the data converted into sentence-level text as the first sentence pool, together with the time information (time code) for the voice data.

The time information indicates the time at which the voice data corresponding to a given converted sentence is located within the entire voice data.

For example, when 10 minutes of voice data is converted into sentence-level text, suppose a particular sentence A is located 4 minutes from the start of the voice data. In this case, the time information may be expressed in absolute form as "00:04:00", or in a form relative to the entire voice data as "240/600(sec)".
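
As a small illustration, the two representations in this example could be produced as follows. The function names are hypothetical, and offsets are assumed to be whole seconds.

```python
def absolute_time_code(offset_sec):
    """Format an offset from the start of the audio as HH:MM:SS."""
    hours, rem = divmod(int(offset_sec), 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def relative_time_code(offset_sec, total_sec):
    """Format an offset relative to the total length of the audio."""
    return f"{int(offset_sec)}/{int(total_sec)}(sec)"


# Sentence A: 4 minutes into 10 minutes of voice data.
print(absolute_time_code(240))       # 00:04:00
print(relative_time_code(240, 600))  # 240/600(sec)
```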

The sentence extraction unit 110 extracts text from the received document data sentence by sentence and stores it as the second sentence pool.

The document data may be a commonly used document file such as an MS-Word file or an Adobe PDF file, a web document such as HTML, or an existing e-book. Equivalently, a digitally produced digital stream containing text may be accepted as input, as may an Internet link in the form of the address of a web document.

When text is extracted from the document data sentence by sentence, semantic and grammatical analysis of the text may be used, and the punctuation marks present in the document data may be used at the same time.
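
A minimal sketch of punctuation-based extraction is given below. It assumes that a sentence ends at '.', '!' or '?' followed by whitespace, which is a simplification; the semantic and grammatical analysis mentioned above is not modeled, and `split_sentences` is a hypothetical name.

```python
import re

# Split at whitespace that directly follows sentence-ending punctuation.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Return the sentences of text as a list (the 'second sentence pool')."""
    return [part.strip() for part in _SENTENCE_BOUNDARY.split(text) if part.strip()]


pool2 = split_sentences("I saw a picture. It was about a boa constrictor.")
```

A splitter of this kind leaves unpunctuated run-on text as a single entry, which is the situation the later adjustment against the recognized speech is meant to handle.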

In addition, when extracting text from the document data sentence by sentence, the sentence extraction unit 110 may also extract the position information of each extracted sentence within the document data and store it in the second sentence pool. That is, the top, bottom, left, and right coordinates of the sentence on the page of the document where it appears may be extracted and stored with it.

The sentence matching unit 120 determines, based on a preset similarity, the sentence of the first sentence pool (first sentence) that matches a sentence of the second sentence pool (second sentence).

Matching sentences means checking whether the two sentences agree with each other, and commonly used text matching techniques may be employed.

Similarity means the degree to which the two sentences resemble each other; a similarity of 100% means that all the characters of the two sentences are identical, disregarding spacing and punctuation.

Take as an example the second sentence “내가 여섯 살 때에 <체험담>이라는 제목의, 원시림에 관한 책에서 멋있는 그림을 보았다.” ("When I was six years old, I saw a magnificent picture in a book about the primeval forest, titled <True Stories>."). When voice data of a person reading this text is converted back into text, the first sentence may come out as “내가 여섯 살 때 체험담이라는 제목의 원시림에 관한 책에서 멋있는 그림을 보였다.” The differences from the second sentence are the dropped particle "에", the missing angle brackets and comma, and the final verb "보였다" ("was seen") in place of "보았다" ("saw").

In this case the two sentences do not match 100%, but it is right to judge them as matching. The similarity required for two sentences to be judged as matching therefore needs to be set in advance, taking into account the recognition accuracy of speech recognition on the voice data.
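
One way to picture a preset similarity is sketched below. It assumes, for illustration only, that similarity is the `difflib.SequenceMatcher` ratio computed after keeping only letters and digits (one reading of "disregarding spacing and punctuation"), and that the threshold is chosen by the operator. The specification does not prescribe a particular measure; the function names and the default threshold are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(sentence):
    """Keep only letters and digits, dropping spacing, punctuation and symbols."""
    return "".join(ch for ch in sentence if ch.isalnum())


def similarity(first, second):
    """Return a similarity in [0.0, 1.0]; 1.0 means identical after normalization."""
    return SequenceMatcher(None, normalize(first), normalize(second)).ratio()


def is_match(first, second, threshold=0.8):
    """Judge two sentences as matching when similarity reaches the preset threshold."""
    return similarity(first, second) >= threshold
```

With this sketch, two sentences that differ only in spacing and punctuation score 1.0, while a misrecognized word lowers the score in proportion to how many characters it affects, so the threshold expresses how much recognition error is tolerated.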

The sentence matching unit 120 may also adjust the second sentence with reference to the first sentence when the matched first and second sentences differ in length.

That is, semantic analysis, grammatical analysis, and punctuation are used when extracting the second sentence from the document data, but because the extracted sentence cannot be guaranteed to be grammatically complete, the extracted sentence is corrected against the voice data read by a human.

In adjusting the second sentence, when the first sentence is shorter than the second sentence, the second sentence may be split at the end of the first sentence, and the part of the second sentence after that end point may be stored back into the second sentence pool.

When the first sentence is longer than the second sentence, the next sentence to be matched in the second sentence pool (third sentence) may be split at the end of the first sentence; the part of the third sentence up to that end point is appended to the end of the second sentence, and the part after it is stored back into the second sentence pool.

For example, suppose the first sentence is “내가 여섯 살 때에 체험담이라는 제목의 원시림에 관한 책에서 멋있는 그림을 보았다.” ("When I was six years old, I saw a magnificent picture in a book about the primeval forest titled True Stories.") and the second sentence is “내가 여섯 살 때에 <체험담>이라는 제목의, 원시림에 관한 책에서 멋있는 그림을 보았다 그 그림은 보아뱀에 관한 것이었다.” (the same sentence followed, with no period in between, by "The picture was about a boa constrictor."). Using the end of the first sentence, ‘보았다’ ("saw"), as the reference point, the part of the second sentence after it, “그 그림은 보아뱀에 관한 것이었다.”, is separated and stored back into the second sentence pool.

As another example, suppose the first sentence is “내가 여섯 살 때에 체험담이라는 제목의 원시림에 관한 책에서 멋있는 그림을 보았는데 그 그림은 보아뱀에 관한 것이었다.” ("When I was six years old, I saw a magnificent picture in a book about the primeval forest titled True Stories, and the picture was about a boa constrictor.") and the second sentence is “내가 여섯 살 때에 체험담이라는 제목의 원시림에 관한 책에서 멋있는 그림을 보았는데.” (the first clause only). The next sentence to be matched in the second sentence pool (third sentence), “그 그림은 보아뱀에 관한 것이었다.”, is then split with reference to the end of the first sentence, ‘것이었다’, and the part of the third sentence up to that end point is appended to the end of the second sentence. The second sentence can thus become “내가 여섯 살 때에 체험담이라는 제목의 원시림에 관한 책에서 멋있는 그림을 보았는데 그 그림은 보아뱀에 관한 것이었다.”
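
The two adjustment cases can be sketched as follows, using English stand-in sentences. This is a rough illustration under several assumptions that are not in the specification: the "end of the first sentence" is located by searching for the final word of the first sentence, the occurrence nearest the expected character offset is chosen, and small length differences (within `tolerance` letters) are ignored as recognition noise. All names are hypothetical.

```python
import re


def letters(sentence):
    """Count letters and digits only, ignoring spacing and punctuation."""
    return sum(ch.isalnum() for ch in sentence)


def split_after_end(first, text, expected):
    """Split text after the final word of `first`, using the occurrence that
    ends nearest to character offset `expected`.
    Returns (head, tail), or None if the word does not occur in text."""
    words = re.findall(r"\w+", first)
    if not words:
        return None
    ends = [m.end() for m in re.finditer(re.escape(words[-1]), text)]
    if not ends:
        return None
    cut = min(ends, key=lambda e: abs(e - expected))
    while cut < len(text) and not text[cut].isalnum():
        cut += 1  # keep closing punctuation with the head
    return text[:cut].strip(), text[cut:].strip()


def adjust_second(first, second, pool2, tolerance=3):
    """Return `second` adjusted to the extent of `first`.
    pool2 holds the document sentences that follow `second`; it is modified in place."""
    diff = letters(first) - letters(second)
    if diff < -tolerance:                 # first is shorter: split second
        parts = split_after_end(first, second, len(first))
        if parts and parts[1]:
            pool2.insert(0, parts[1])     # remainder returns to the pool
            return parts[0]
    elif diff > tolerance and pool2:      # first is longer: borrow from third
        parts = split_after_end(first, pool2[0], len(first) - len(second))
        if parts:
            head, tail = parts
            pool2.pop(0)
            if tail:
                pool2.insert(0, tail)     # unused part of third returns to the pool
            return second + " " + head
    return second
```

A word-search heuristic like this can pick the wrong split point when the final word repeats, so a practical system would more likely align the two texts character by character; the sketch only shows the bookkeeping of splitting, merging, and returning remainders to the pool.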

When a second sentence that matches the first sentence at or above the preset similarity is determined, the synchronization unit 130 synchronizes the time code of the first sentence with the second sentence and stores the result.

Synchronization means that when a particular sentence is selected in the document data, the time at which the spoken reading of that sentence is located in the voice data can be determined.

For example, the time information of the first sentence may be stored as additional metadata of the second sentence, together with the second sentence, in the received document data, or it may be stored together with the second sentence in a separate file.

The synchronization unit 130 may also synchronize the time code of the first sentence with the sentence position information of the second sentence.

For example, if the second sentence is located on page 4 of the document data, 100 mm to the right and 150 mm up from an origin at the left corner, its position information can be expressed as "Page 4:X_100, Y_150". This position information may be stored in a separate file together with the time information of the first sentence (for example, "00:04:00").
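
A separate synchronization file of this kind might look like the following sketch. The JSON layout, the field names, and the function names are assumptions made for illustration; only the position string "Page 4:X_100, Y_150" and the time code "00:04:00" come from the example above.

```python
import json


def position_label(page, x_mm, y_mm):
    """Format a sentence position in the style used in the example above."""
    return f"Page {page}:X_{x_mm}, Y_{y_mm}"


def make_sync_record(sentence, time_code, page, x_mm, y_mm):
    """Pair a document sentence and its position with the matched time code."""
    return {
        "sentence": sentence,
        "position": position_label(page, x_mm, y_mm),
        "time_code": time_code,
    }


def find_time_code(records, page, x_mm, y_mm):
    """Look up the time code for the sentence selected at a given position."""
    wanted = position_label(page, x_mm, y_mm)
    for record in records:
        if record["position"] == wanted:
            return record["time_code"]
    return None


def to_sync_file(records):
    """Serialize the records as the text of a separate synchronization file."""
    return json.dumps(records, ensure_ascii=False, indent=2)
```

Keeping the position as a lookup key reflects the stated purpose of synchronization: a reader selects a sentence on the page, and the player seeks to the stored time code.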

The control unit 140 performs matching and synchronization until either the first sentence pool or the second sentence pool has no sentence left to match, thereby completing the matching and synchronization of the received voice data and document data.
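
The control flow described here can be sketched as a loop that stops as soon as either pool is empty. The sketch below makes assumptions beyond the specification: each second sentence is compared with every remaining first sentence and paired with the best-scoring one, similarity is the `difflib.SequenceMatcher` ratio over letters and digits, and a second sentence with no match above the threshold is simply skipped. All names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity over letters and digits only (an illustrative choice of measure)."""
    def strip(s):
        return "".join(ch for ch in s if ch.isalnum())
    return SequenceMatcher(None, strip(a), strip(b)).ratio()


def synchronize(pool1, pool2, threshold=0.8):
    """Match and synchronize until either pool has no sentence left.

    pool1: list of (recognized_text, time_code) from the voice data.
    pool2: list of sentences extracted from the document data.
    Returns a list of (document_sentence, time_code) pairs.
    """
    pool1, pool2 = list(pool1), list(pool2)   # leave the caller's lists intact
    synced = []
    while pool1 and pool2:
        second = pool2.pop(0)
        best = max(range(len(pool1)), key=lambda i: similarity(pool1[i][0], second))
        if similarity(pool1[best][0], second) >= threshold:
            _, time_code = pool1.pop(best)    # matching step
            synced.append((second, time_code))  # synchronization step
        # Otherwise the second sentence has no counterpart and is skipped.
    return synced
```

Because one sentence leaves the second pool on every pass, the loop always terminates, which mirrors the stopping condition stated for the control unit.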

An embodiment of the e-book authoring method of the e-book authoring apparatus according to the present invention is described with reference to FIG. 2. According to FIG. 2, the e-book authoring method comprises: (a) receiving voice data and document data; (b) processing the voice data and the document data and storing the results in a first sentence pool and a second sentence pool, respectively; (c) determining the sentence of the first sentence pool (first sentence) that matches a sentence of the second sentence pool (second sentence); (d) synchronizing the time code of the first sentence with the second sentence and storing the result; and (e) repeating steps (c) and (d) until no sentence remains to be matched.

The e-book authoring apparatus receives voice data and document data (S210), converts the voice data into sentence-level text by speech recognition and stores it, together with time information (time code) for the voice data, as a first sentence pool, and extracts text from the document data sentence by sentence and stores it as a second sentence pool (S220).

The voice data may be divided into sentences through semantic and grammatical analysis of the text, or the division may use the gap lengths of the voice data together with semantic analysis of the sentences.

The data converted into sentence-level text is stored in the first sentence pool together with the time information (time code) for the voice data.

In addition, when text is extracted from the document data sentence by sentence, the position information of each extracted sentence within the document data may also be extracted and stored in the second sentence pool. That is, the top, bottom, left, and right coordinates of the sentence on the page of the document where it appears may be extracted and stored with it.

Once sentences have been extracted from the received voice data and document data, the sentence of the first sentence pool (first sentence) that matches a sentence of the second sentence pool (second sentence) is determined based on a preset similarity (S230).

Similarity means the degree to which the two sentences resemble each other; 100% means that all the characters of the two sentences are identical, disregarding spacing and punctuation.

또한, 상기 매칭되는 문장의 판별 이후에 매칭된 상기 제 1 문장과 상기 제 2 문장의 길이가 다른 경우, 상기 제 1 문장을 기준으로 상기 제 2 문장을 조정할 수도 있다.The second sentence may be adjusted based on the first sentence when the length of the matched first sentence and the second sentence is different after the determination of the matched sentence.

상기 제 2 문장의 조정은, 상기 제 1 문장이 상기 제 2 문장보다 짧은 경우, 상기 제 1 문장의 말미를 기준으로 상기 제 2 문장을 분할하고, 상기 제 2 문장의 상기 말미 이후 부분을 다시 상기 제 2 문장 풀로 저장할 수 있다. The adjusting of the second sentence may include: when the first sentence is shorter than the second sentence, dividing the second sentence based on the end of the first sentence, and repeating the part after the end of the second sentence. Can be stored as a second sentence pool.

When the first sentence is longer than the second sentence, the next sentence to be matched in the second sentence pool (the third sentence) is split at the end of the first sentence; the part of the third sentence up to that end point is appended to the end of the second sentence, and the part after it may be stored back in the second sentence pool.
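The two adjustment cases above can be sketched as follows. How the end of the first sentence is located inside the document text is not specified in the description; this sketch assumes it is found by counting characters other than spacing and punctuation, and that merged parts are joined with a single space. The names `split_after` and `adjust_second` are hypothetical.

```python
import unicodedata
from collections import deque
from typing import Deque, Tuple


def is_punct(ch: str) -> bool:
    return unicodedata.category(ch).startswith("P")


def significant_length(text: str) -> int:
    """Count characters other than whitespace and punctuation."""
    return sum(1 for ch in text if not ch.isspace() and not is_punct(ch))


def split_after(text: str, n: int) -> Tuple[str, str]:
    """Split text after its n-th significant character, keeping trailing punctuation."""
    if n <= 0:
        return "", text.strip()
    seen = 0
    for i, ch in enumerate(text):
        if not ch.isspace() and not is_punct(ch):
            seen += 1
            if seen == n:
                j = i + 1
                while j < len(text) and is_punct(text[j]):
                    j += 1  # keep closing punctuation with the head
                return text[:j].strip(), text[j:].strip()
    return text.strip(), ""


def adjust_second(first: str, second: str, pool: Deque[str]) -> str:
    """Adjust the second (document) sentence to the extent of the first (spoken) one."""
    n_first, n_second = significant_length(first), significant_length(second)
    if n_first < n_second:
        # First sentence is shorter: split the second, return the rest to the pool.
        head, tail = split_after(second, n_first)
        if tail:
            pool.appendleft(tail)
        return head
    if n_first > n_second and pool:
        # First sentence is longer: borrow the missing part from the third sentence.
        third = pool.popleft()
        head, tail = split_after(third, n_first - n_second)
        if tail:
            pool.appendleft(tail)
        return (second + " " + head).strip()
    return second


pool = deque(["and slept. Then I woke up."])
print(adjust_second("I went home and slept", "I went home", pool))
print(list(pool))
```

The sketch borrows from only one following sentence, as in the description; a first sentence spanning more than two document sentences would need the borrowing step repeated.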

When a second sentence that matches the first sentence at or above the preset similarity is determined, the time information (time code) of the first sentence is synchronized with the second sentence and then stored (S240).

The time information (time code) of the first sentence may also be synchronized with the sentence position information of the second sentence and then stored.

The matching and synchronization steps are repeated until no sentence remains to be matched in either the first sentence pool or the second sentence pool (S250), which completes the matching and synchronization of the input voice data and document data.
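Steps S230 to S250 can be sketched as a simple loop. The similarity measure (`difflib.SequenceMatcher` over alphanumeric characters), the threshold of 80%, the best-match search over the first pool, and the choice to skip a document sentence that has no match are all assumptions for illustration; the description does not fix any of them.

```python
import difflib
from typing import List, Tuple

Spoken = Tuple[str, float]  # (recognized text, start time in seconds); assumed format


def similarity(a: str, b: str) -> float:
    """Similarity in percent, ignoring spacing and punctuation (assumed measure)."""
    def strip(s: str) -> str:
        return "".join(ch for ch in s if ch.isalnum())
    return 100.0 * difflib.SequenceMatcher(None, strip(a), strip(b)).ratio()


def synchronize(first_pool: List[Spoken], second_pool: List[str],
                threshold: float = 80.0) -> List[Tuple[str, float]]:
    """Pair each document sentence with the time code of its matching spoken sentence."""
    first, second = list(first_pool), list(second_pool)
    synced: List[Tuple[str, float]] = []
    while first and second:  # S250: stop when either pool has nothing left
        doc_sentence = second.pop(0)
        # S230: find the spoken sentence most similar to this document sentence
        best = max(first, key=lambda s: similarity(s[0], doc_sentence))
        if similarity(best[0], doc_sentence) >= threshold:
            first.remove(best)
            synced.append((doc_sentence, best[1]))  # S240: attach the time code
        # Otherwise the document sentence is skipped (assumption).
    return synced


spoken = [("Hello world", 0.0), ("This is a test", 1.5)]
document = ["Hello, world!", "This is a test."]
print(synchronize(spoken, document))
```

Storing the time code next to the sentence position information instead of the sentence text would give the variant in which selecting a location on the page leads directly to the corresponding audio.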

The specific terms used in the embodiments, which are described to aid understanding of the present invention, do not limit the invention. The present invention may include all components that are obvious to those of ordinary skill in the art, as well as all components of equivalent value.

Claims

An e-book authoring method of an e-book authoring apparatus, the method comprising:
(a) receiving voice data and document data, respectively;
(b) converting the voice data into sentence-level text by speech recognition and storing the text in a first sentence pool together with the time information (time code) of the voice data, and extracting text from the document data sentence by sentence and storing the extracted text in a second sentence pool;
(c) determining a sentence (first sentence) of the first sentence pool that matches a sentence (second sentence) of the second sentence pool on the basis of a preset similarity;
(d) synchronizing the time information of the first sentence with the second sentence and storing it; and
(e) repeating steps (c) and (d) until no sentence remains to be matched in either the first sentence pool or the second sentence pool,
the method further comprising, after step (c):
(c-2) adjusting the second sentence on the basis of the first sentence when the first sentence and the second sentence differ in length.

The method according to claim 1,
wherein, when text is extracted from the document data sentence by sentence in step (b), sentence position information of each extracted sentence within the document data is also extracted and stored in the second sentence pool.

delete

The method according to claim 2,
wherein, in step (d), the time information (time code) is synchronized with the sentence position information of the second sentence.

An e-book authoring apparatus comprising:
a sentence extracting unit that receives voice data and document data, respectively, converts the voice data into sentence-level text by speech recognition and stores the text in a first sentence pool together with the time information (time code) of the voice data, and extracts text from the document data sentence by sentence and stores the extracted text in a second sentence pool;
a sentence matching unit that determines a sentence (first sentence) of the first sentence pool that matches a sentence (second sentence) of the second sentence pool on the basis of a preset similarity;
a synchronization unit that synchronizes the time information of the first sentence with the second sentence and stores it;
a memory unit that stores the first sentence pool, the second sentence pool, and synchronization information; and
a controller that controls matching and synchronization through the sentence matching unit and the synchronization unit until no sentence remains to be matched in either the first sentence pool or the second sentence pool,
wherein the sentence matching unit adjusts the second sentence on the basis of the first sentence when the matched first sentence and second sentence differ in length.

The apparatus according to claim 5,
wherein the sentence extracting unit, when extracting text from the document data sentence by sentence, also extracts sentence position information of each extracted sentence within the document data and stores it in the second sentence pool.

delete

The apparatus according to claim 6,
wherein the synchronization unit synchronizes the time information (time code) with the sentence position information of the second sentence.