KR20090055932A

KR20090055932A - Method, system and computer-readable recording medium for extracting text based on tag information

Info

Publication number: KR20090055932A
Application number: KR1020070122819A
Authority: KR
Inventors: 이윤현; 김규일; 박진수
Original assignee: 엔에이치엔(주)
Priority date: 2007-11-29
Filing date: 2007-11-29
Publication date: 2009-06-03
Also published as: KR100916814B1

Abstract

A method, a system, and a computer-readable recording medium for extracting text based on tag information are provided, which make the information on a web page easy to obtain and allow a TTS (Text To Speech) service to be provided by extracting the text in a web page that matches the user's intention. A text pointer on a web page is recognized (S310). It is determined whether a tag pair surrounding the text pointed to by the text pointer is a tag pair for extraction of predetermined text (S330). The text surrounded by the determined tag pair is extracted (S350). A TTS service is provided to the user on the basis of the text extracted from the web page.

Description

METHOD, SYSTEM AND COMPUTER-READABLE RECORDING MEDIUM FOR EXTRACTING TEXT BASED ON TAG INFORMATION

The present invention relates to a method and a system for extracting text based on tag information of the markup language used to create a web page, and to a computer-readable recording medium recording a computer program for executing the method. More specifically, the present invention relates to a method, a system, and a computer-readable recording medium recording a computer program for executing the method which, when text is extracted from a web page and used to provide a text-based service such as speech conversion or translation, analyze the tag information of the markup language used to create the web page so that text of the most appropriate range is extracted on that basis.

In recent years, as Internet use has become commonplace, it has become possible to obtain a wide variety of information through the Internet. Companies that provide Internet services through web sites offer many kinds of services to satisfy the increasingly diverse needs of Internet users, and the range of such services is growing day by day.

Users encounter the services of these companies in various forms and, in particular, seek to obtain Internet content such as news, dictionary, specialized, local, and shopping information through web sites.

To obtain the content they want, such users search through a web site, and when they find the desired content on a particular web page, they generally read that content, which consists mainly of text, with their own eyes. From the user's point of view, however, relying only on text-oriented content can be unwelcome in today's multimedia era. In practice, as the amount of information contained in web pages keeps growing, there is also the problem that the user must keep their eyes on a display means, such as the monitor of the user computer, until all of the text has been read. In addition, some users want to multitask, obtaining the desired information from the content while carrying on with other work, and this need has been difficult to satisfy.

Meanwhile, CTI (Computer Telephony Integration) technologies such as VoIP (Voice over IP), speech recognition, speech conversion, speech synthesis, and automatic response systems have recently attracted considerable attention. These technologies are expected to enable advanced Internet services in which users give instructions by voice, receive information by voice, and communicate by voice, even in the Internet environment.

Accordingly, TTS (Text To Speech) technology has been developed both to solve the problems of text-oriented content delivery and for wide use in CTI technology. TTS technology, which can be used more widely than speech recognition technology, is a human interface technology that converts various kinds of text information into speech. On web pages, TTS is mainly realized by extracting the text of a web page, converting it into speech, and providing the speech to the user. For example, in response to a mouse-over event that occurs when the user keeps the mouse still at a position on a web page for a certain time, the word at the position of the mouse pointer may be extracted and converted into speech; alternatively, the user may drag over a portion of the text on a web page to have it converted into speech.

However, the TTS services currently offered through web pages cannot be called a complete human interface technology. Specifically, current TTS services either convert into speech only the word at the position recognized by the user's mouse-over operation, or require the user to drag the mouse to designate the text to be converted. In the former case, only the moused-over word is converted into speech, regardless of the user's intention. In the latter case, to convert the desired range of text into speech, the user must at least skim the text by eye and then designate the range to be converted. This defeats the purpose of TTS technology, which is to avoid as far as possible the need for the user to read the text directly, and the designation itself takes additional time.

If, to avoid these problems, all of the text in a web page is extracted and a TTS service is provided on that basis, text in areas the user does not want may also be extracted.

And if, to improve user convenience, buttons or the like are added to a web page to designate a specific area of the page, extract its text, and provide a TTS service, existing web pages may have to be modified extensively to implement the TTS service.

Therefore, a technology is needed that solves all of the above problems and improves user convenience.

An object of the present invention is to solve all of the problems of the prior art described above.

Another object of the present invention is to analyze the tag information of the markup language used to create a web page and to actively extract text on that basis.

A further object of the present invention is to analyze the tag information of the markup language used to create a web page and to extract text of various different ranges on that basis.

A further object of the present invention is to make it unnecessary, when extracting text from a web page, for the user to perform cumbersome operations such as mouse dragging or for the web page to be specially modified.

A still further object of the present invention is to allow a user of a web page to conveniently obtain speech data converted from the extracted text.

The configuration of the present invention for achieving the above objects is as follows.

According to one aspect of the present invention, there is provided a method of extracting text based on tag information of the markup language used to create a web page, the method comprising: recognizing a text pointer on a web page; determining whether at least one tag pair surrounding the text pointed to by the text pointer in the web page is a tag pair for extraction of predetermined text; and extracting the text surrounded by the tag pair determined in the determining step.

According to another aspect of the present invention, there is provided a system for extracting text based on tag information of the markup language used to create a web page, the system comprising: a text pointer recognition unit that recognizes a text pointer on a web page; a tag pair determination unit that determines whether at least one tag pair surrounding the text pointed to by the text pointer in the web page is a tag pair for extraction of predetermined text; and a text extraction unit that extracts the text surrounded by the tag pair determined by the tag pair determination unit.

In addition, according to the present invention, there are further provided other methods and systems for extracting the text of a web page and/or providing a TTS service based on the extracted text, and a computer-readable recording medium recording a computer program for executing the methods.

According to the present invention, text in a web page can be extracted in accordance with the user's intention, and a TTS service can be provided on that basis.

In addition, according to the present invention, text is extracted automatically in a manner suited to the characteristics of the web page and a TTS service is provided, so that the user can easily obtain the information on the web page and the accessibility of the web page to the user is improved.

Furthermore, according to the present invention, text on a web page can be extracted, and a TTS service provided on that basis, without having to modify the web page.

Hereinafter, various embodiments of the present invention will be described in detail with reference to the accompanying drawings.

Configuration of the entire system

FIG. 1 is a diagram showing a schematic configuration of a TTS service providing system according to an embodiment of the present invention.

As shown in FIG. 1, the TTS service providing system according to an embodiment of the present invention may include a user computer 100 and a TTS server 300. The user computer 100 and the TTS server 300 can communicate through various network environments, such as a local area network (LAN) or a wide area network (WAN) using a dedicated line. The network environment may be the well-known World Wide Web (WWW).

The TTS server 300 is a server for providing a TTS service (for example, a service that converts text included in a web page into speech), and a TTS conversion unit 310 in the TTS server 300 can perform the conversion of text into speech. The TTS server 300 may be a web server of an Internet portal site or an operating server of a company that specializes in TTS services. The TTS server 300 can communicate bidirectionally with one or more user computers 100 over an Internet protocol in a known network environment. In response to a request from the user computer 100, the TTS server 300 can perform the necessary processing by referring to a speech conversion database 500.

The speech conversion database 500 stores information on speech data corresponding to specific text.

Configuration of the user computer

FIG. 2 is a diagram showing the detailed configuration of the user computer 100 in the TTS service providing system of FIG. 1.

As shown in FIG. 2, the user computer 100 may include an operation unit 110, a tag information database 130, a program storage unit 150, a user input unit 170, and an output unit 190.

The operation unit 110 may include a mouse-over recognition unit 111, a tag pair determination unit 113, a text extraction unit 115, a text transmission unit 117, and a speech data providing unit 119. According to an embodiment of the present invention, at least some of the mouse-over recognition unit 111, the tag pair determination unit 113, the text extraction unit 115, the text transmission unit 117, and the speech data providing unit 119 may be program modules that are included in the operation unit 110 or that communicate with it. Such program modules may be included in the operation unit 110 in the form of an operating system, application program modules, and other program modules, and may be physically stored on various known storage devices. They may also be stored in a remote storage device capable of communicating with the operation unit 110. Such program modules include, but are not limited to, routines, subroutines, programs, objects, components, and data structures that perform the particular tasks described later or implement particular abstract data types according to the present invention.

As described in detail below, the tag information database 130 stores, for a web page having a given identifier (for example, a URL), information on whether a tag pair of the markup language used to create that page, which surrounds the text pointed to by the mouse pointer, is a tag pair for extraction of predetermined text. The operation unit 110 can refer to the tag information database 130 as needed. Although FIG. 2 shows the operation unit 110 and the tag information database 130 as separate components, the tag information database 130 may also be included in the operation unit 110 as one of its components. Furthermore, as long as it is used to realize an individual embodiment of the present invention, the tag information database 130 need not reside in the user computer 100 and may instead be included in an external device, for example the TTS server 300. In that case, however, the operation unit 110 would have to consult the TTS server 300 every time it extracts text, so the configuration shown in FIG. 2 is preferable for computational efficiency.
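As an illustration only, the role of the tag information database 130 can be sketched in Python as a lookup from a page identifier to the tag names that count as extraction tag pairs for that page. The specification does not define a storage format, so the dictionary layout, the example hosts, the longest-prefix matching rule, and the function names `extraction_tags_for` and `is_extraction_tag` are all assumptions made for this sketch.

```python
from urllib.parse import urlsplit

# Hypothetical contents: URL prefix (host + path prefix) -> tag names that
# are treated as extraction tag pairs on pages under that prefix.
TAG_INFO_DB = {
    "news.example.com/": {"p", "h1"},
    "news.example.com/board/": {"li"},
    "dict.example.com/": {"dd"},
}


def extraction_tags_for(url, db=TAG_INFO_DB):
    """Return the extraction tag names registered for the page at `url`.

    Assumption: when several prefixes match, the longest one wins, so a
    rule for part of a site can override the site-wide rule.
    """
    parts = urlsplit(url)
    key = parts.netloc + (parts.path or "/")
    matches = [prefix for prefix in db if key.startswith(prefix)]
    if not matches:
        return set()
    return db[max(matches, key=len)]


def is_extraction_tag(url, tag_name, db=TAG_INFO_DB):
    """Answer the question the tag pair determination unit 113 asks."""
    return tag_name.lower() in extraction_tags_for(url, db)
```

Keeping such a table on the user computer matches the preference stated above: each lookup is a local dictionary access rather than a round trip to the TTS server 300.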

The operation unit 110 may further include a program driving unit (not shown) so that the program stored in the program storage unit 150, that is, the program for extracting text and providing a TTS service according to the present invention, is started together with the web browser. The program storage unit 150 need not be included as a component of the user computer 100 and may be replaced by a known computer-readable recording medium such as a hard disk, a floppy disk, a floptical disk, a magnetic tape, a CD-ROM, or a DVD.

The user input unit 170 may be an ordinary computer input means such as a keyboard or a mouse, and the output unit 190 may be implemented as a computer monitor for visually displaying the web browser and/or the web page, a speaker capable of outputting the text as speech, or the like.

For reference, the components of FIGS. 1 and 2 can communicate with one another as needed, but the known communication means used for that communication are not specifically shown.

Text extraction and speech conversion

Hereinafter, with reference to FIGS. 1 to 3, the process of extracting text from a web page and the process of converting the extracted text into speech and outputting it according to an embodiment of the present invention are described. FIG. 3 is a flowchart showing the process in which, when the mouse is held over text included in a web page, text of a predetermined area is extracted, converted into speech, and output according to an embodiment of the present invention.

When the user launches a web browser on the user computer 100, the program for extracting text and for converting the extracted text into speech and outputting it according to an embodiment of the present invention is started together with it. As described above, this program may be recorded in the program storage unit 150 in the user computer 100 or on a separate recording medium.

The user can then connect to the Internet and visit a web page having a given URL through the launched web browser. Numerous servers provide content that can be viewed through a web browser, and URLs are typically used to indicate the location of that content. A URL is meant to specify the location of a file on a server on the Internet, but because a URL can be defined relatively freely, it can also carry other information that characterizes the web page (for example, information on the tag pair for extraction of predetermined text according to an embodiment of the present invention). In either case, a URL or a part of a URL can be associated with information on the tag pair for extraction of predetermined text according to the present invention.

Such web pages are typically written in a markup language, for example HTML (HyperText Markup Language), XML (eXtensible Markup Language), or XHTML (eXtensible HyperText Markup Language). A markup language is a programming language used to create web pages; HTML, which is particularly widely used, was developed for writing hypertext in web documents. A markup language includes commands that define the font size, font color, font shape, graphics, document navigation (hyperlinks), and so on of a web page. The commands used in a markup language are collectively called tags, and tags are characterized in that two tags, one marking the start and one marking the end, form a pair. In this specification, such a pair is called a tag pair.

The countless kinds of web pages in circulation today are implemented by web documents written in such markup languages, and the web browser interprets these documents so that the user can view them comfortably. In other words, it is the web document written in a markup language and the tag pairs it contains that let the user see web pages of various layouts.

Referring to FIG. 3, the process of extracting text from a web page and outputting the speech data converted from it according to an embodiment of the present invention is described below.

First, when the user places the mouse pointer on text included in a web page displayed by the web browser of the user computer 100, the mouse-over recognition unit 111 of the operation unit 110 determines in step S310 whether a mouse-over event has occurred. In general, a user who wants to receive the TTS service according to an embodiment of the present invention keeps the mouse over the relevant text for at least a certain time. The mouse-over recognition unit 111 can therefore recognize that a mouse-over event has occurred when the mouse pointer remains still over some text for at least a certain time.
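The dwell-based recognition of step S310 can be sketched as a small state machine fed with pointer samples. This is a minimal Python sketch under stated assumptions: the class name `DwellDetector`, the one-second threshold, and the two-pixel jitter tolerance are illustrative choices, not values given in the specification, and hit-testing whether the pointer is actually over text is left out.

```python
class DwellDetector:
    """Reports a mouse-over once the pointer has rested long enough."""

    def __init__(self, dwell_seconds=1.0, tolerance_px=2):
        self.dwell_seconds = dwell_seconds  # assumed threshold
        self.tolerance_px = tolerance_px    # assumed jitter allowance
        self._anchor = None                 # (t, x, y) where the pointer came to rest
        self._fired = False                 # avoid reporting the same rest twice

    def update(self, t, x, y):
        """Feed one pointer sample; return True exactly once per rest."""
        if self._anchor is not None:
            t0, x0, y0 = self._anchor
            moved = (abs(x - x0) > self.tolerance_px
                     or abs(y - y0) > self.tolerance_px)
        else:
            moved = True
        if moved:
            # The pointer moved (or this is the first sample): restart timing.
            self._anchor = (t, x, y)
            self._fired = False
            return False
        if not self._fired and t - self._anchor[0] >= self.dwell_seconds:
            self._fired = True
            return True
        return False
```

A caller would feed timestamps and coordinates from its own pointer events and start the tag pair determination of step S330 when `update` returns `True`.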

In step S330, the tag pair determination unit 113 determines whether the tag pair surrounding the currently moused-over text is the final tag pair for text extraction and, based on the result, detects the final tag pair. This is described in more detail with reference to FIG. 4.

The tag pair determination unit 113 first recognizes the lowest tag pair, that is, the one closest to the pointed text (step S410), and determines whether this tag pair is the one that decides the range of text to be finally extracted at that position in the web page (step S430). In doing so, the tag pair determination unit 113 can refer to the information stored in the tag information database 130, namely the information on whether a given tag pair of the web page is a tag pair that indicates text extraction. If the tag information database 130 indicates that the lowest tag pair is not a tag pair for text extraction, the tag pair determination unit 113 repeats the determination for the tag pair one level higher, which surrounds the lowest tag pair (step S431), and continues until a tag pair for text extraction is detected. In other words, the determination proceeds in order from the lowest tag pair surrounding the pointed text toward the highest tag pair, and stops as soon as a single tag pair for text extraction is detected (step S433). Conversely, if the lowest tag pair is itself a tag pair for text extraction, the tag pair determination unit 113 stops without further determination (step S433).

For example, suppose there is text written in a markup language, 「<p>왜 노벨상이 없을까요?</p>」 ("Why is there no Nobel Prize?"), and the mouse pointer is on the character '상' when the mouse-over event occurs. The tag pair determination unit 113 then determines whether the <p>, </p> tag pair, which is the lowest tag pair surrounding the text pointed to by the mouse pointer, i.e. '상', is the final tag pair for text extraction. If reference to the tag information database 130 confirms that the <p>, </p> tag pair is a tag pair for text extraction, further determination processing stops. Conversely, if the <p>, </p> tag pair is not a tag pair for text extraction, the determination processing is repeated for the tag pair one level higher.
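The upward search of steps S410 to S433 can be sketched as a walk along parent links. In this Python sketch the `Element` class and the function name `find_extraction_element` are hypothetical, the set `extraction_tags` stands in for a lookup in the tag information database 130, and returning `None` when no enclosing tag pair qualifies is an assumption, since the specification does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """One tag pair; `parent` is the tag pair that directly encloses it."""
    tag: str
    parent: Optional["Element"] = None


def find_extraction_element(innermost, extraction_tags):
    """Walk from the lowest tag pair upward until an extraction tag pair is found."""
    node = innermost                     # S410: start at the lowest tag pair
    while node is not None:
        if node.tag in extraction_tags:  # S430: is it an extraction tag pair?
            return node                  # S433: stop at the first match
        node = node.parent               # S431: move one level higher
    return None                          # assumption: nothing qualifies


# Usage: the pointed text sits in <span> inside <p> inside <div> inside <body>.
body = Element("body")
div = Element("div", body)
p = Element("p", div)
span = Element("span", p)
```

With `extraction_tags = {"p"}`, the search that starts at `span` stops at `p`, which mirrors the <p>, </p> example above.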

In step S350, the text surrounded by the tag pair determined by the tag pair determination unit 113 in step S330 is extracted. This step is performed by the text extraction unit 115. In the example above, if the tag pair determination unit 113 has determined that the <p>, </p> tag pair is the final tag pair for text extraction, the text surrounded by that tag pair, '왜 노벨상이 없을까요?', is extracted.
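Steps S330 and S350 together can be sketched with Python's standard `html.parser` module. This is only an illustration: locating the pointed text by a substring (`pointed_text`) is a stand-in for real pointer hit-testing, the class and function names are hypothetical, the void-element list is deliberately short, and `extraction_tags` again stands in for the tag information database 130.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}  # never have an end tag


class Node:
    def __init__(self, tag, parent=None):
        self.tag, self.parent, self.children = tag, parent, []

    def text(self):
        """All text enclosed by this tag pair, nested tags included."""
        return "".join(c if isinstance(c, str) else c.text() for c in self.children)


class TreeBuilder(HTMLParser):
    """Builds a small tree of tag pairs and text from markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:  # ignore end tags that match nothing
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def _innermost_containing(node, needle):
    """Find the lowest tag pair whose own text contains `needle`."""
    for child in node.children:
        if isinstance(child, Node):
            found = _innermost_containing(child, needle)
            if found is not None:
                return found
        elif needle in child:
            return node
    return None


def extract_text(html, pointed_text, extraction_tags):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    node = _innermost_containing(builder.root, pointed_text)
    while node is not None and node.tag not in extraction_tags:
        node = node.parent  # climb to the enclosing tag pair
    return node.text() if node is not None else None
```

Changing only `extraction_tags` changes the range of text that one and the same pointer position yields, which is how different ranges can be extracted without modifying the web page.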

In step S370, the text transmission unit 117 transmits the final text extracted in step S350 to the TTS server 300. The TTS conversion unit 310 of the TTS server 300 converts the received text into speech by referring to the speech conversion database 500, which stores the information needed to convert text into speech, and sends the result back to the user computer 100. The speech conversion database 500 may store speech data corresponding to each text code, or speech data corresponding to each individual syllable of the actual text.

In step S380, the user computer 100 receives the speech data transmitted from the TTS server 300, that is, the data converted from text into speech by the TTS conversion unit 310 of the TTS server 300. In step S390, the received speech data is provided by the speech data providing unit 119 of the operation unit 110. The speech data can be output to the user through the output unit 190, such as a speaker.
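The flow of steps S370 to S390 can be sketched with a stub in place of the TTS server 300. Everything in this Python sketch is an assumption made for illustration: the class `StubTtsServer`, the per-syllable byte clips standing in for the speech conversion database 500, the choice to skip syllables with no stored clip, and the collapsing of network transmission into a direct method call. A real TTS engine does not work by simple concatenation of stored bytes.

```python
class StubTtsServer:
    """Stand-in for the TTS server 300 and its TTS conversion unit 310."""

    def __init__(self, speech_db):
        # Stand-in for the speech conversion database 500:
        # one stored clip (bytes) per syllable.
        self.speech_db = speech_db

    def convert(self, text):
        # Assumption: syllables without a stored clip are skipped.
        return b"".join(self.speech_db.get(ch, b"") for ch in text)


def provide_speech(text, server, output):
    """Send extracted text, receive speech data, and hand it to the output unit."""
    audio = server.convert(text)  # S370 (send) and S380 (receive), collapsed
    output(audio)                 # S390: speech data providing unit 119 -> output unit 190
    return audio


# Usage with made-up clip bytes and a list standing in for the speaker.
played = []
server = StubTtsServer({"왜": b"\x01", "상": b"\x02"})
result = provide_speech("왜 상", server, played.append)
```

Passing `output` as a callable keeps the sketch independent of any particular audio library.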

Below, the method of analyzing tag information to decide whether to extract text according to an embodiment of the present invention is examined in more detail, with reference to the configuration of the tag information database 130, one of the main components of the TTS service providing system described above.

Tag information database

As is well known, the web pages that a user can reach by running a web browser on the user computer 100 and connecting to a web site take on a variety of appearances depending on the description in the underlying markup language and the tags used. For example, the main page of a portal site is divided into many areas (for example, a title area, a banner area, and a video area), each area contains several short pieces of text, and each short piece of text may include a hyperlink to another web page. A web page such as an encyclopedia may be divided into an area that displays a search term and an area that describes it. A web page showing an online bulletin board, meanwhile, is divided into an area showing the board categories and an area showing the board content, and the latter may be further divided into several areas, one per post title.

These visually divided areas mostly contain text unrelated to one another. When receiving a text-based service, a user will generally prefer that unrelated text be extracted separately wherever possible, and that closely related text be extracted together as one unit, before the extracted text is converted into speech. A user may also want a text-based service applied all at once to pieces of text that are linguistically separated, such as sentences or paragraphs, if they deal with related content. In addition, a user may want the text in each visually distinct area of a web page to be extracted separately and to receive a service based on it.

Within a web page written in a given markup language, visual areas are distinguished mainly through the use of tags. Therefore, if text is extracted only within the range of a specific tag pair, the text in each visually separated area of the web page can be extracted individually. Also, even when a tag pair separates sentences or paragraphs, if that tag pair is defined as not being a tag pair for extracting predetermined text in that area, the division made by that tag pair is ignored and all the text in the wider area can be extracted.

According to an embodiment of the present invention, the tag information database 130 may store information on whether a specific tag pair is a tag pair for extracting predetermined text. Suppose, for example, that the tag information database 130 stores the information that the tag pair <p>, </p> is not a tag pair for extracting predetermined text, and that only the tag pair one level above it, <div>, </div>, is. Then, in a web page written in markup such as

<div>

<P>Mathematics is closely related to science and is an important discipline that many fields of study need, yet</P>

<P>Why is there no Nobel Prize?</P>

</div>

the tag pair determination unit 113 determines that the tag pair <div>, </div>, one level above the lowest tag pair <p>, </p>, is the tag pair for text extraction, and the text extraction unit 115 can extract the text enclosed by the tag pair <div>, </div>.
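The determination just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the tag information database is reduced to a set of designated tag names, the enclosing tag pairs are given as a list ordered from the innermost pair outward, and the names `EXTRACTION_TAGS` and `find_extraction_tag` are hypothetical.

```python
from typing import Iterable, Optional, Set

# Hypothetical stand-in for the tag information database (130):
# the set of tag names designated as tag pairs for text extraction.
EXTRACTION_TAGS: Set[str] = {"div"}


def find_extraction_tag(enclosing_tags: Iterable[str],
                        designated: Set[str]) -> Optional[str]:
    """Return the first designated tag, checking from the innermost pair outward.

    `enclosing_tags` lists the tag pairs around the pointed text, lowest
    (closest to the text) first. Tag names are compared case-insensitively.
    """
    for tag in enclosing_tags:
        if tag.lower() in designated:
            return tag.lower()
    return None


if __name__ == "__main__":
    # <P> is checked first and skipped; <div> is designated, so the walk stops.
    print(find_extraction_tag(["P", "div", "body", "html"], EXTRACTION_TAGS))
```

Because the walk stops at the first designated pair, tags further out are never examined once a match is found.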

Meanwhile, which tag pairs serve as tag pairs for extracting predetermined text may vary with the characteristics of the web page as needed, and the tag information database 130 may accordingly store different information for each web page or for each type of web page. For example, for a web page containing a news article, extracting the full article including its title at once for speech conversion or translation may best meet the user's needs, so only the topmost tag pair containing all the text may be designated and stored as the tag pair for text extraction. In the case of an encyclopedia web page, the search term and its description may be visually separated into different areas, but the tag pair for extracting predetermined text may be designated so that only the search term and its description are extracted.
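One way to picture per-page storage is a lookup keyed by part of the page's URL. The sketch below is an assumption-laden illustration: the URL prefixes, the tag names stored for each, the longest-prefix matching rule, and the names `TAG_INFO_DB` and `extraction_tags_for` are all hypothetical and are not taken from the patent.

```python
from typing import Dict, Set

# Hypothetical contents: the URL prefixes and tag names are illustrative only.
TAG_INFO_DB: Dict[str, Set[str]] = {
    "http://news.example.com/": {"body"},         # news: one topmost pair
    "http://kin.example.com/": {"div", "span"},   # Q&A: titles and bodies
}


def extraction_tags_for(url: str, db: Dict[str, Set[str]]) -> Set[str]:
    """Return the designated tags stored for the longest prefix matching `url`.

    An empty set means no tag pair is designated for this page.
    """
    best = max((prefix for prefix in db if url.startswith(prefix)),
               key=len, default=None)
    return db[best] if best is not None else set()


if __name__ == "__main__":
    print(extraction_tags_for("http://kin.example.com/qna/1", TAG_INFO_DB))
    print(extraction_tags_for("http://other.example.com/", TAG_INFO_DB))
```

Matching on a prefix rather than the full URL lets one entry cover every page of the same type on a site.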

In the case of Naver™, the applicant's well-known portal site, each web page is described by various tag pairs. For several types of web page, the tag pairs that can be used as tag pairs for extracting predetermined text and those that cannot are divided as shown in the following table:

It goes without saying that the information on tag pairs for extracting predetermined text stored in the tag information database 130 may be changed as needed or as user preferences change.

Example of text extraction using tag information

An example of performing text extraction using the tag information stored in the tag information database 130 according to an embodiment of the present invention is described in detail below.

For convenience of explanation, assume a web page like that of FIG. 5. The lower part (the question part) of the content of the web page of FIG. 5 consists of the following markup.

<div>

<P>Mathematics is closely related to science and is an important discipline that many fields of study need, yet</P>

<P>Why is there no Nobel Prize?</P>

<P>Please also write in detail about the Fields Medal</P>

<P>They say it is the Nobel Prize of mathematics...</P>

</div>

Assume that the tag information database 130 stores the tag pair <div>, </div> as the designated tag pair for text extraction, in association with at least part of the identifier of the web page of FIG. 5. Assume also that the mouse pointer points at the word 'no' in 'Why is there no Nobel Prize?'.

First, as seen above, the lowest tag pair enclosing 'no', the text pointed at by the mouse pointer, is <p>, </p>, so it is determined whether this is a tag pair for text extraction. No such tag pair is found in the tag information database 130, however, so the determination moves on to the tag pair one level above, <div>, </div>. As noted above, this tag pair is designated as a tag pair for text extraction, so the text it encloses becomes the target of text extraction according to an embodiment of the present invention. Accordingly, in this web page, the portion enclosed by the tag pair <div>, </div>, namely

'Mathematics is closely related to science and is an important discipline that many fields of study need, yet

Why is there no Nobel Prize?

Please also write in detail about the Fields Medal

They say it is the Nobel Prize of mathematics...'

is extracted as the final text. The extracted text may be displayed in visually inverted (highlighted) form so that the user can check it quickly. The final text may also be transmitted to the TTS server 300 and converted into speech, as described above, or translated.
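The walk-through above can be reproduced end to end with Python's standard `html.parser`. This is a sketch under stated assumptions rather than the embodiment itself: the pointer position is simulated by a substring of the pointed text, the tag information database is a plain set of tag names, tags are assumed to come in properly nested pairs, and `Node`, `TreeBuilder`, `innermost_node`, and `extract_text` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import List, Optional, Set, Union


class Node:
    """One element of the page; children are nested elements or text strings."""

    def __init__(self, tag: str, parent: Optional["Node"]) -> None:
        self.tag = tag
        self.parent = parent
        self.children: List[Union["Node", str]] = []

    def text(self) -> str:
        parts = [c.text() if isinstance(c, Node) else c for c in self.children]
        return "\n".join(p for p in parts if p)


class TreeBuilder(HTMLParser):
    """Builds a small element tree; HTMLParser delivers tag names lowercased."""

    def __init__(self) -> None:
        super().__init__()
        self.root = Node("#root", None)
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; stray end tags are ignored.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.children.append(data.strip())


def innermost_node(node: Node, pointed: str) -> Optional[Node]:
    """Depth-first search for the element whose own text contains `pointed`."""
    for child in node.children:
        if isinstance(child, Node):
            found = innermost_node(child, pointed)
            if found is not None:
                return found
        elif pointed in child:
            return node
    return None


def extract_text(markup: str, pointed: str, designated: Set[str]) -> Optional[str]:
    """Climb from the lowest enclosing tag pair until a designated pair is found."""
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    node = innermost_node(builder.root, pointed)
    while node is not None and node.tag not in designated:
        node = node.parent
    return node.text() if node is not None else None


MARKUP = """<div>
<P>Mathematics is closely related to science and is an important discipline that many fields of study need, yet</P>
<P>Why is there no Nobel Prize?</P>
<P>Please also write in detail about the Fields Medal</P>
<P>They say it is the Nobel Prize of mathematics...</P>
</div>"""

if __name__ == "__main__":
    print(extract_text(MARKUP, "Why is there no", {"div"}))
```

With only `div` designated, all four paragraphs come back as one unit; designating `p` instead would return just the pointed paragraph, and designating a tag that does not enclose the text returns `None`.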

FIG. 6 shows a web page to which the method of providing text extraction and speech conversion services according to an embodiment of the present invention is well suited.

On a web page such as that of FIG. 6, a user will want the question title, the question body, the answer title, and the answer body each extracted separately for speech conversion or translation. At the same time, even if the question body or the answer body consists of several paragraphs, the user will want them grouped together and converted into speech or translated in one go.

In that case, according to an embodiment of the present invention, a tag pair such as <SPAN>, </SPAN>, which contains the question title and the answer title, is designated as a tag pair for extracting predetermined text, and a tag pair such as <div>, </div>, which contains the question body or the answer body, is likewise designated, and both are stored in the tag information database 130. Then, when the user moves the mouse over the question title area (area 1) or the answer title area (area 3) and triggers a mouseover event, the text of the question title or answer title is extracted as the final text, converted into speech, and output. Likewise, when a mouseover event occurs in the question body area (area 2) or the answer body area (area 4), the entire question body or the entire answer body can be converted into speech by the same logic.
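The behaviour described for FIG. 6 can be tried out on a simplified page. The sketch below rests on several assumptions: the page is a hypothetical, well-formed stand-in for FIG. 6 (so the standard `xml.etree.ElementTree` parser can read it), the mouseover position is simulated by a substring of the hovered text, and the names `PAGE`, `DESIGNATED`, and `text_on_mouseover` are illustrative.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Set

# Hypothetical, simplified stand-in for the page of FIG. 6.
PAGE = """<body>
<span>Question title</span>
<div><p>Question paragraph one.</p><p>Question paragraph two.</p></div>
<span>Answer title</span>
<div><p>Answer paragraph one.</p><p>Answer paragraph two.</p></div>
</body>"""

# Both the title tag pair and the body tag pair are designated for extraction.
DESIGNATED: Set[str] = {"span", "div"}


def text_on_mouseover(page: str, hovered: str,
                      designated: Set[str]) -> Optional[str]:
    """Return the text of the nearest designated tag pair around `hovered`."""
    root = ET.fromstring(page)
    # ElementTree keeps no parent links, so build a child-to-parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    # First element, in document order, whose own text contains the hovered string.
    node = next((e for e in root.iter() if e.text and hovered in e.text), None)
    while node is not None and node.tag.lower() not in designated:
        node = parents.get(node)
    if node is None:
        return None
    return " ".join(t.strip() for t in node.itertext() if t.strip())


if __name__ == "__main__":
    print(text_on_mouseover(PAGE, "Question title", DESIGNATED))
    print(text_on_mouseover(PAGE, "Answer paragraph two", DESIGNATED))
```

Hovering over a title yields only that title, while hovering over any paragraph of a body yields the whole body, because the paragraph tag is not designated and the climb continues to the enclosing `div`.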

The embodiments of the present invention described above may be implemented in the form of program instructions executable by various computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the computer software field. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that a computer can execute using an interpreter or the like. The hardware device may be configured to operate as one or more software modules to perform the processing according to the present invention, and vice versa.

Although the present invention has been described above with reference to specific details such as particular components, and to limited embodiments and drawings, these are provided only to aid a more general understanding of the present invention. The present invention is not limited to the above embodiments, and a person of ordinary skill in the art to which the present invention pertains can make various modifications and variations from this description.

Therefore, the spirit of the present invention should not be confined to the embodiments described above, and not only the claims set out below but also everything modified equally or equivalently to those claims falls within the scope of the spirit of the present invention.

FIG. 2 is a diagram showing the detailed configuration of the user computer in the TTS service providing system of FIG. 1.

FIG. 3 is a flowchart showing the process by which, when the mouse is moved over text included in a web page, the text of a predetermined area is extracted, converted into speech, and output, according to an embodiment of the present invention.

FIG. 4 is a flowchart showing the process of determining a tag pair for extracting predetermined text according to an embodiment of the present invention.

FIG. 5 is a diagram illustrating a web page that can be a target of text extraction according to an embodiment of the present invention.

<Description of reference numerals for the main parts of the drawings>

100: user computer

110: operation unit

130: tag information database

150: program storage unit

170: user input unit

190: output unit

300: TTS server

500: speech conversion database

Claims

A method of extracting text based on tag information of a markup language used for creating a web page, the method comprising:

recognizing a text pointer on a web page;

determining whether at least one tag pair surrounding the text pointed to by the text pointer in the web page is a tag pair for extraction of predetermined text; and

extracting the text surrounded by the tag pair determined in the determining step.

The method of claim 1,

wherein the text pointer information is generated by a mouseover event.

The method of claim 2,

wherein the mouseover event is generated when the mouse pointer remains within a predetermined area of the web page for a predetermined time or longer.

The method of claim 1,

wherein the determining step is performed sequentially, from the lowest tag pair closest to the pointed text to the highest tag pair farthest from it.

The method of claim 1,

wherein the determining step stops as soon as a single tag pair for extracting the predetermined text is detected.

The method of claim 1,

wherein the determining step is performed with reference to a tag information database including information on a tag pair for extracting the predetermined text.

The method of claim 6,

wherein the information on a tag pair for extracting the predetermined text is stored in the tag information database in association with an identifier of the web page.

The method of claim 7,

wherein the identifier of the web page is a URL.

The method of claim 1,

wherein the extracting step extracts the text surrounded by the tag pair first determined in the determining step.

A method of converting text into speech, the method comprising

generating speech data associated with text extracted by the method according to any one of the preceding claims.

The method of claim 10,

wherein the generated speech data is speech data corresponding to the extracted text.

The method of claim 10,

wherein the generated speech data is speech data corresponding to a translation of the extracted text.

A system for extracting text based on tag information of a markup language used for creating a web page, the system comprising:

a text pointer recognition unit that recognizes a text pointer on a web page;

a tag pair determination unit that determines whether at least one tag pair surrounding the text pointed to by the text pointer in the web page is a tag pair for extracting predetermined text; and

a text extraction unit that extracts the text surrounded by the tag pair determined by the tag pair determination unit.

The system of claim 13,

wherein the text pointer information is generated by a mouseover event.

The system of claim 14,

wherein the mouseover event is generated when the mouse pointer remains within a predetermined area of the web page for a predetermined time or longer.

The system of claim 13,

wherein the tag pair determination is performed sequentially, from the lowest tag pair closest to the pointed text to the highest tag pair farthest from it.

The system of claim 13,

wherein the tag pair determination stops as soon as a single tag pair for extracting the predetermined text is detected.

The system of claim 13,

further comprising a tag information database including information on a tag pair for extracting the predetermined text,

wherein the tag pair determination is performed with reference to the tag information database.

The system of claim 18,

wherein the tag information database stores the information on a tag pair for extracting the predetermined text in association with an identifier of the web page.

The system of claim 19,

wherein the identifier of the web page is a URL.

The system of claim 13,

wherein the text extraction unit extracts the text surrounded by the tag pair first determined by the tag pair determination unit.

A system for extracting text based on tag information of a markup language used for creating a web page and converting the text into speech, the system comprising:

a text pointer recognition unit that recognizes a text pointer on a web page;

a tag pair determination unit that determines whether at least one tag pair surrounding the text pointed to by the text pointer in the web page is a tag pair for extracting predetermined text;

a text extraction unit that extracts the text surrounded by the tag pair determined by the tag pair determination unit; and

a speech data generation unit that generates speech data associated with the text extracted by the text extraction unit.

The system of claim 22,

wherein the speech data generated by the speech data generation unit corresponds to the extracted text.

The system of claim 22,

wherein the speech data generated by the speech data generation unit corresponds to a translation of the extracted text.

A computer readable recording medium having recorded thereon a computer program for executing the method according to any one of claims 1 to 9.

A computer readable recording medium having recorded thereon a computer program for executing the method according to claim 10.