KR20090020921A - Method and apparatus for providing mobile voice web

Info

Publication number: KR20090020921A
Application number: KR1020070085560A
Authority: KR
Inventors: 조정미; 김지연; 송윤경; 곽병관; 김남훈; 한익상
Original assignee: 삼성전자주식회사
Priority date: 2007-08-24
Filing date: 2007-08-24
Publication date: 2009-02-27
Also published as: US9251786B2; KR101359715B1; US20090055179A1

Abstract

A method and apparatus for providing a mobile voice web are provided. By generating a grammar suited to the user's web context, the otherwise unlimited grammar required for voice recognition is restricted, so that efficient voice recognition can run on the terminal rather than on a server. A content data management unit (110) analyzes the user's web history from the user's web search logs and generates a voice access list based on the analysis result. A dynamic grammar generation unit (120) generates a speech recognition grammar that reflects the voice access list. A speech interpretation unit (130) generates a web command by matching the user's input speech against the generated speech recognition grammar.

Description

Method and apparatus for providing mobile voice web

The present invention relates to a method and apparatus for providing a mobile voice web and, more particularly, to a method and apparatus that allow web access, web navigation, and web search to be performed easily and quickly by voice in a mobile environment.

As the mobile Internet has become commonplace, searching the web from a mobile terminal and downloading content to it over the web have become frequent. However, because the buttons on a terminal are small, entering search terms with them is inconvenient and slow. Navigating web pages with direction keys is also slow, and cursor movement is poorly synchronized with button presses, which makes efficient web navigation difficult.

To make web search in this mobile Internet environment easier, web search techniques based on voice recognition are being developed. Existing voice-based web search proposals use a server-client architecture. The following server-client web access and search techniques are known.

Korean Patent No. 0486030 relates to an apparatus and method for accessing Internet sites from a mobile wireless terminal using voice recognition. It discloses a technique in which speech input at the terminal is recognized by a voice recognition server so that the terminal moves to the desired Internet site, and in which a multimedia server maps a speech recognition grammar from the URL transmitted together with the speech input and delivers it to the voice recognition server.

Korean Patent Laid-Open Publication No. 2000-0087281 relates to an Internet search method for a mobile wireless terminal using voice recognition. It discloses a technique in which the Internet is searched by voice, without a separate voice recognition module, through a voice recognition server that registers the user's voice data. Search terms are recognized by comparison with the user's registered voice data, and unregistered voice data is patterned through a database-building algorithm.

However, in the techniques described above, although speech is input at the terminal, recognition is performed not on the terminal but on a server reached over a communication network. A large-scale voice recognition engine that demands substantial computing resources and a large grammar is therefore required. Moreover, using the communication network to recognize the user's speech imposes a cost on the user and makes recognition dependent on communication speed and network conditions.

In addition, because the same voice recognition model is applied to all users without reflecting each user's individual web history, these techniques fail to reflect the fact that a mobile terminal is used mainly in a personal setting.

An object of the present invention is to provide a method and apparatus that allow web search to be performed easily and quickly by voice in a mobile environment.

In particular, an object of the present invention is to provide a method and apparatus for direct web access, web navigation, and web search within a mobile terminal by dynamically generating and managing, within the terminal, a speech recognition grammar that reflects the user's web history.

To achieve this technical object, a method of providing a voice web in a mobile terminal comprises: analyzing the user's web history from the user's web search logs and generating a voice access list based on the analysis result; generating a speech recognition grammar that reflects the generated voice access list; and generating a web command by matching the user's input speech against the generated speech recognition grammar.

To achieve another technical object of the present invention, an apparatus for providing a voice web in a mobile terminal includes: a content data management unit that analyzes the user's web history from the user's web search logs and generates a voice access list based on the analysis result; a dynamic grammar generation unit that generates a speech recognition grammar reflecting the generated voice access list; and a speech interpretation unit that generates a web command by matching the user's input speech against the generated speech recognition grammar.

To achieve yet another technical object, the present invention includes a recording medium on which a program for executing the above method on a computer is recorded.

Details and improvements of the invention are disclosed in the dependent claims.

A method of providing a voice web in a mobile terminal according to an embodiment of the present invention analyzes the user's web history from the user's web search logs, generates a voice access list based on the result, and dynamically generates a speech recognition grammar that reflects the list before performing voice recognition. Because the grammar is tailored to the user's web context, the otherwise unlimited grammar required for voice recognition is restricted, and efficient voice recognition that can run on the terminal rather than on a server can be implemented.

In addition, reflecting the analysis of the user's web logs when generating the grammar raises the recognition success rate for words that were not registered in advance, and recognizing the user's speech input within the terminal makes it possible to provide a voice recognition service that does not depend on the communication network.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a schematic block diagram of an apparatus 100 for providing a voice web in a mobile terminal according to an embodiment of the present invention.

Referring to FIG. 1, the apparatus 100 for providing a voice web in a mobile terminal includes a content data management unit 110, a dynamic grammar generation unit 120, and a speech interpretation unit 130.

The voice web providing apparatus 100 can operate in a mobile terminal such as a mobile communication terminal or a PDA. A mobile terminal according to a preferred embodiment of the present invention recognizes the user's voice to connect directly to a wireless Internet site and to perform web navigation and web search.

The content data management unit 110 analyzes the user's web history from the user's web search logs and generates a voice access list based on the analysis result.

The dynamic grammar generation unit 120 receives the voice access list from the content data management unit 110 and dynamically generates a speech recognition grammar that reflects it.

The speech interpretation unit 130 receives speech from the user and generates a web command by matching the speech against the speech recognition grammar generated by the dynamic grammar generation unit 120. Here, the web command includes the URL of a specific site so that the web site can be accessed directly.

FIG. 2 is a schematic block diagram of the content data management unit 110 according to another embodiment of the present invention.

Referring to FIG. 2, the content data management unit 110 may include a web search log analysis unit 200, which analyzes the number of the user's visits to web sites and the distribution of visit times, and a voice access list generation unit 210, which generates the voice access list from the results of the web search log analysis unit 200.

As also shown in the figure, the content data management unit 110 may include, in addition to the components for accessing a web site directly by voice, components for web navigation and web search: a web site classification unit 220 that classifies the field of a site from the URL of the site accessed by the user's mobile terminal; a web content analysis unit 230 that analyzes the HTML source of the web content corresponding to the accessed site; and a link text extraction unit 240 that extracts, from the analysis result of the web content analysis unit 230, link text that can be used for web navigation and web search on the accessed site.

A preferred embodiment of the present invention enables direct web access, web navigation, and web search by voice on a mobile terminal. Here, web access means connecting to a specific site by entering a URL in the address bar of a web browser, and direct web access means connecting to a site immediately by speaking its name. Web navigation means selecting a hypertext link on the page currently shown in the web browser; in a preferred embodiment of the present invention, a link can be selected by speaking its hypertext. Web search means entering a search term in a search box to find desired information; here, the search is performed by speaking the search term.

The web search log analysis unit 200 analyzes the user's web log information to identify the sites the user visited within a given period and generates the voice access list. For direct web access by voice, it also builds a default site list by selecting web sites that users in general visit frequently. In addition, it reflects a bookmark list built from the sites that the user has registered as favorites.

The web search log analysis unit 200 generates a user visited site list that reflects the number of the user's visits to web sites and the distribution of visit times. To do so, it extracts the Internet address, title, access time, access frequency, and the like of each visited site from the web history stored in the mobile terminal. The user visited site list is ordered by descending score, where the score is calculated using Equations 1 to 3 below.

The access frequency of a site s_i is calculated as in Equation 1 below.

Here, F(s_i) is the number of occurrences of site s_i in the web history.

The access distribution is calculated as in Equation 2 below.

Here, P_t(s) is P(s) measured at time t.

The score function that reflects the site access frequency and access distribution calculated above is given by Equation 3 below.

Here, α* and β* are the weights for the access frequency and the access distribution, respectively.

The user visited site list is generated in descending order of the score calculated by Equation 3.

Preferably, the voice access list combines the default site list and the bookmark list and is updated with the user visited site list computed using Equation 3.
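Equations 1 to 3 are not reproduced in this text, so the following is only a sketch of how a ranking of this kind might look. It assumes that access frequency is a site's share of all history entries and that access distribution is the share of distinct days on which the site was visited. Both definitions, the weights `alpha` and `beta`, and the function name `rank_sites` are illustrative assumptions rather than the formulas of the patent.

```python
from collections import Counter


def rank_sites(history, alpha=0.5, beta=0.5):
    """Rank sites from (url, day) visit records, highest score first.

    Assumed definitions (not taken from the patent's equations):
      frequency    = visits to the site / total visits
      distribution = distinct days with a visit / distinct days in history
      score        = alpha * frequency + beta * distribution
    """
    if not history:
        return []
    visits = Counter(url for url, _ in history)
    days_by_site = {}
    for url, day in history:
        days_by_site.setdefault(url, set()).add(day)
    total_visits = len(history)
    total_days = len({day for _, day in history})
    scores = {
        url: alpha * visits[url] / total_visits
        + beta * len(days_by_site[url]) / total_days
        for url in visits
    }
    # Sort by descending score; ties are broken by URL for determinism.
    return sorted(scores, key=lambda u: (-scores[u], u))
```

With equal weights, a site visited on several different days can outrank a site visited more often on a single day, which is one plausible reading of why the text combines frequency with a time distribution.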

For web navigation and web search, the content data management unit 110 may further include the web site classification unit 220, the web content analysis unit 230, and the link text extraction unit 240.

To support web navigation by voice, the link text that the user can select by voice must be extracted from the current page. Link text can be extracted by analyzing the tags in the HTML source of the page. In an HTML document, link text is marked with the <A> tag and the URL appears as the value of href. A simple tag analysis therefore yields both the link text the user can select and the URL that each link text points to.
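The tag analysis described above can be sketched with the `html.parser` module of the Python standard library. The class name `LinkTextExtractor` and the helper `extract_links` are hypothetical names chosen for this illustration, and the handling of whitespace and of anchors without an href is an assumption.

```python
from html.parser import HTMLParser


class LinkTextExtractor(HTMLParser):
    """Collect (link text, href) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":  # HTMLParser lowercases tag and attribute names
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the link text.
            text = " ".join("".join(self._text).split())
            if text:
                self.links.append((text, self._href))
            self._href = None


def extract_links(html):
    parser = LinkTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links
```

Anchors without an href attribute are skipped, since they give the user nothing to navigate to.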

Supporting web search by voice would ordinarily require unrestricted speech recognition. In a preferred embodiment of the present invention, web pages are classified and a list of search terms specific to the page's category is added to the speech recognition grammar, which makes web search by voice possible. For example, a user who visits a shopping site usually wants shopping-related searches such as product search, ordering, or payment offered by that site. Therefore, if the current web site falls in the shopping category, predefined shopping-related vocabulary is added to the grammar.

In a preferred embodiment of the present invention, the field of a web site is classified by analyzing its URL and the title of the web page. If the current web site is not a field-specific site, a list of popular search terms is extracted from a portal site. Web site categories include, for example, news, stocks, movies, music, shopping, and travel.
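The text does not say how the URL and title are analyzed, so the following is a minimal sketch that assumes simple keyword matching. The keyword table `CATEGORY_KEYWORDS` and the function `classify_site` are hypothetical; only the category names come from the description above.

```python
# Hypothetical keyword table; only the category names come from the text.
CATEGORY_KEYWORDS = {
    "news": ["news"],
    "stocks": ["stock", "finance"],
    "movies": ["movie", "cinema"],
    "music": ["music"],
    "shopping": ["shop", "market", "mall"],
    "travel": ["travel", "tour"],
}


def classify_site(url, title):
    """Return the first category whose keyword occurs in the URL or title."""
    text = f"{url} {title}".lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    # Not field-specific: the caller would fall back to popular search terms.
    return None
```

Returning `None` for an unclassified site mirrors the fallback in the text, where a site that is not field-specific uses the popular search term list instead of a category vocabulary.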

FIG. 3 is a schematic block diagram of the dynamic grammar generation unit 120 according to another embodiment of the present invention.

Referring to FIG. 3, the dynamic grammar generation unit 120 dynamically generates the grammar, which is the resource actually used for voice recognition, so that it reflects the voice access list generated by the content data management unit 110.

The dynamic grammar generation unit 120 may also include a user intention analysis unit 300 that analyzes the intention behind the user's speech input using the site classification result and the extracted link text, a keyword extraction unit 310 that extracts keywords from the generated voice access list and the extracted link text, and a grammar generation unit 320 that generates a speech recognition grammar according to the user's intention.

The user intention analysis unit 300 determines the user's intention for the input speech by taking the state of the terminal and the characteristics of the site into account. In other words, it allows the grammar for voice recognition to be generated dynamically according to the user's intention. The grammar is generated according to the user's intention as indicated, for example, by the characteristics of the site. For a general portal site, a list of popular search terms is generated. For a site specialized in a particular field, a field-specific list is generated: a list of product-related search terms for a shopping mall, a list of registered companies for stock quote searches on a stock site, and a list of movie titles, actors, and the like for a movie site.

A method for analyzing the user's intention is described below with reference to FIG. 4.

The keyword extraction unit 310 removes meaningless symbols from the site list, the link text, and the search term list, and extracts candidates for what the user may actually utter from the cleaned text. That is, keywords are extracted from the text so that an utterance is recognized even if the user speaks only part of a site name, link text, or search term rather than the whole. After meaningless symbols are removed from the lists generated by the content data management unit 110, morphological analysis, vocabulary-based analysis, or the like is applied to extract keywords in units of space-delimited words or morphemes.
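The symbol removal and space-delimited extraction can be sketched as follows. This illustration covers only the whitespace-unit case; morphological analysis is not attempted. The function name `extract_keywords` and the rule that anything other than letters, digits, and whitespace counts as a meaningless symbol are assumptions.

```python
import re


def extract_keywords(entries):
    """Return utterance candidates: each cleaned entry plus its space-separated parts."""
    candidates = []
    for entry in entries:
        # Replace anything that is not a letter, digit, or whitespace with a space,
        # then collapse runs of whitespace.
        cleaned = " ".join(re.sub(r"[^\w\s]|_", " ", entry).split())
        if not cleaned:
            continue
        for candidate in [cleaned, *cleaned.split(" ")]:
            if candidate not in candidates:
                candidates.append(candidate)
    return candidates
```

Keeping both the full entry and its parts lets a user say either "Naver Blog" or just "Blog", which is the partial-utterance behavior the paragraph above describes.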

The grammar generation unit 320 generates a grammar for voice recognition from the keywords extracted from the site list and the link text. It also updates the grammar with the list of search terms that corresponds to the category of the current web site.
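The text does not specify a grammar format, so the following sketch simply represents the grammar as a flat mapping from a spoken phrase to a command type and target. The function `build_grammar`, the command type labels, and the rule that site names take priority over link text and search terms are all assumptions made for illustration.

```python
def build_grammar(site_urls, link_urls, search_terms):
    """Build a flat grammar: spoken phrase -> (command type, target).

    site_urls:    site name -> registered URL
    link_urls:    link keyword -> href URL on the current page
    search_terms: search terms for the current site category
    """
    grammar = {}
    # setdefault keeps the first entry, so earlier sources take priority.
    for phrase, url in site_urls.items():
        grammar.setdefault(phrase, ("direct_access", url))
    for phrase, url in link_urls.items():
        grammar.setdefault(phrase, ("web_navigation", url))
    for phrase in search_terms:
        grammar.setdefault(phrase, ("web_search", phrase))
    return grammar
```

Rebuilding this mapping whenever the page or site category changes is one way to keep the grammar small enough for on-terminal recognition.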

FIG. 4 is a flowchart illustrating a method of extracting the user's intention according to another embodiment of the present invention.

Referring to FIG. 4, in step 400 it is determined whether the web browser of the mobile terminal is running. If the web browser is not running, it is determined in step 402 that the user is likely to utter speech for direct site access, and in step 404 the grammar is generated from the voice access list.

If the web browser is running, direct web access, web navigation, and web search are all possible. In step 406, it is determined whether the current site is a specific (field-specific) web site. The search term list can then be restricted according to the classification of the current web site: if the site is specialized in a field such as stocks or movies, the search terms are restricted to the list specific to that field, and otherwise to the general list of popular search terms. If the site is not a specific web site, the process proceeds to step 408, where the user's intention is determined to be direct site access, web navigation, or general web search, and in step 410 the grammar is generated from the voice access list, the link text, and the popular search term list. If the site is a specific web site, the process proceeds to step 412, where the user's intention is determined to be direct site access, web navigation, or specific web search, and the grammar is generated from the voice access list, the link text, and the specific query list. The keyword extraction unit 310 then extracts keywords from the lists prepared for grammar generation.
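The decision flow of FIG. 4 can be sketched as a single function. The function name `select_grammar_sources`, the string labels for intentions and lists, and the convention that a site category of `None` means "not field-specific" are assumptions for this illustration; the branch structure follows steps 400 to 412 as described above.

```python
def select_grammar_sources(browser_running, site_category):
    """Map terminal state and site category to (intentions, grammar sources).

    site_category is None when the current site is not field-specific.
    """
    # Steps 400 to 404: no browser running, so only direct site access is expected.
    if not browser_running:
        return ["direct_access"], ["voice_access_list"]
    # Steps 406 to 410: general site, so use the popular search term list.
    if site_category is None:
        return (
            ["direct_access", "web_navigation", "general_search"],
            ["voice_access_list", "link_text", "popular_search_terms"],
        )
    # Step 412: field-specific site, so use that field's query list.
    return (
        ["direct_access", "web_navigation", "specific_search"],
        ["voice_access_list", "link_text", f"{site_category}_query_list"],
    )
```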

FIG. 5 is a schematic block diagram of the speech interpretation unit 130 according to another embodiment of the present invention.

Referring to FIG. 5, the speech interpretation unit 130 generates a web command by matching the user's input speech against the generated speech recognition grammar. The speech interpretation unit 130 includes a speech recognition unit 500 and a web command generation unit 510.

The speech interpretation unit 130 performs voice recognition using the grammar generated by the dynamic grammar generation unit 120 and generates the user's web command from the recognition result. The speech recognition unit 500 performs partial matching between the grammar and a candidate phoneme sequence output by a phoneme detector (not shown), searches for a list of candidates with high matching scores, and outputs it as the recognition result. The web command generation unit 510 generates, from the recognition result, a web command that carries out the user's actual intention. Here, web commands cover direct web access, web navigation, and web search. For direct web access, the recognized speech is replaced with the URL registered for the site. For web navigation, it is replaced with the href URL of the link text that corresponds to the speech input. For web search, the spoken query is used as the search term.
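The three substitutions performed by the web command generation unit can be sketched as follows, assuming the recognizer has already produced a text phrase. The function `make_web_command`, the lookup order (sites, then links, then search), and the `search_url` template with its `{query}` placeholder are hypothetical; the text does not say how a search request is formed.

```python
from urllib.parse import quote_plus


def make_web_command(recognized, access_list, links, search_url):
    """Turn a recognized phrase into a URL to load.

    access_list: site name -> registered URL (direct web access)
    links:       link text -> href URL on the current page (web navigation)
    search_url:  hypothetical template containing "{query}" (web search)
    """
    if recognized in access_list:
        return access_list[recognized]
    if recognized in links:
        return links[recognized]
    # Neither a known site nor a link: treat the phrase as a search term.
    return search_url.format(query=quote_plus(recognized))
```

For example, with a template of `http://search.example.com/?q={query}`, the phrase "PDP TV" would produce `http://search.example.com/?q=PDP+TV`.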

FIG. 6 is an example of a grammar 600 for providing a voice web according to another embodiment of the present invention.

Referring to FIG. 6, a dynamic grammar 600 is shown by way of example. The grammar 600 consists of a voice access list 610, a link keyword list 620, and a popular search term list 630. The voice access list 610 consists of default sites such as "Naver", "Naver Blog", and "Google", together with entries added from the bookmark list obtained from the user's favorites, for example "Korea Meteorological Administration homepage" and "Tenbyten Emotional Channel Energy".

FIG. 7 illustrates the overall flow of providing a voice web in a mobile terminal according to another embodiment of the present invention.

Referring to FIG. 7, in step 700, when the user says "Naver" to the mobile terminal for direct web access, the terminal connects to the Naver site. In step 702, when the user says "Son Ye-jin" for web navigation on the current Naver site, the link text related to Son Ye-jin is extracted from the current web site and, in step 704, the terminal follows the hyperlink related to Son Ye-jin. Next, when the user says "Gmarket" on the linked page, direct site access to Gmarket is performed and, in step 706, the Gmarket site is opened. When the user says "PDP TV" or "group purchase" on the Gmarket site, the terminal opens the page hyperlinked to PDP TV or opens the group purchase page through web search or web navigation. Because the currently accessed site, Gmarket in this example, is specialized in Internet shopping, the grammar is dynamically composed of shopping-related search terms, which improves voice recognition performance.

Meanwhile, the present invention can be embodied as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device that stores data readable by a computer system.

Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include media implemented in the form of a carrier wave (for example, transmission over the Internet). The computer-readable recording medium can also be distributed over computer systems connected by a network, so that the computer-readable code is stored and executed in a distributed fashion. Functional programs, code, and code segments for implementing the present invention can be readily inferred by programmers in the technical field to which the present invention belongs.

The present invention has been described above with a focus on preferred embodiments. Those of ordinary skill in the art to which the present invention belongs will understand that the invention can be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in a descriptive sense and not for purposes of limitation. The scope of the present invention is defined by the claims rather than by the foregoing description, and all differences within the equivalent scope should be construed as being included in the present invention.

<Explanation of reference numerals for the main parts of the drawings>

100: voice web providing apparatus 110: content data management unit

120: dynamic grammar generation unit 130: speech interpretation unit

200: web search log analysis unit 210: voice access list generation unit

220: web site classification unit 230: web content analysis unit

240: link text extraction unit 300: user intention analysis unit

310: keyword extraction unit 320: grammar generation unit

500: speech recognition unit 510: web command generation unit

Claims

(a) analyzing the user's web history from the user's web search logs and generating a voice access list based on the analysis result;

(b) generating a speech recognition grammar reflecting the generated voice access list; and

(c) generating a web command by matching the input voice of the user with the generated speech recognition grammar.
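Steps (b) and (c) can be illustrated with a minimal sketch. It assumes the speech recognizer has already produced text, represents the grammar as a plain phrase-to-command table, and uses exact matching after normalization. The function names, the command format, and the example URLs are hypothetical and are not taken from the specification.

```python
from typing import Dict, Optional


def build_grammar(voice_access_list: Dict[str, str]) -> Dict[str, dict]:
    """Step (b), simplified: map each normalized spoken name to a web command."""
    return {
        name.strip().lower(): {"action": "open", "url": url}
        for name, url in voice_access_list.items()
    }


def interpret(recognized_text: str, grammar: Dict[str, dict]) -> Optional[dict]:
    """Step (c), simplified: return the command whose phrase matches the input."""
    return grammar.get(recognized_text.strip().lower())


# Hypothetical voice access list: spoken site name -> URL.
voice_access_list = {
    "Example News": "http://news.example.com",
    "Example Mail": "http://mail.example.com",
}
grammar = build_grammar(voice_access_list)
command = interpret("example news", grammar)
```

Because the grammar holds only phrases derived from the user's own access list, the set of utterances the recognizer must distinguish stays small, which is the property the abstract relies on for recognition on the terminal rather than on a server.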

The method of claim 1,

wherein the voice access list reflects a user-visited site list generated by analyzing the user's number of visits to web sites and the distribution of visit times.
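One way to combine the number of visits with the distribution of visit times is sketched below. The scoring rule (total visits plus a bonus for visits made within one hour of the current time of day) is an assumption chosen for illustration, and the log format of `(site, hour)` pairs is hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def rank_visited_sites(
    log: List[Tuple[str, int]], current_hour: int, top_n: int = 3
) -> List[str]:
    """Rank sites by visit count, favoring sites usually visited around now."""
    total: Counter = Counter()
    near_now: Counter = Counter()
    for site, hour in log:
        total[site] += 1
        # Circular distance in hours between the visit time and the current time.
        diff = abs(hour - current_hour) % 24
        if min(diff, 24 - diff) <= 1:
            near_now[site] += 1
    scores = {site: total[site] + near_now[site] for site in total}
    # Highest score first; ties broken alphabetically for a stable result.
    return sorted(scores, key=lambda s: (-scores[s], s))[:top_n]


# Hypothetical web search log: (site, hour of day of the visit).
log = [
    ("news.example.com", 8), ("news.example.com", 9), ("mail.example.com", 8),
    ("shop.example.com", 21), ("shop.example.com", 22), ("shop.example.com", 20),
]
morning = rank_visited_sites(log, current_hour=8)
evening = rank_visited_sites(log, current_hour=21)
```

With this rule the same log yields a different ordering in the morning and in the evening, so the visited site list can follow the user's habits over the day.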

The method of claim 2,

wherein the voice access list reflects a default site list stored in the mobile terminal of the user and a bookmark list obtained from information registered as favorites.

The method of claim 1,

wherein step (a) comprises:

classifying the field of an accessed site from the URL of the accessed site;

analyzing the HTML source of the web content corresponding to the accessed site; and

extracting, from the analysis result, link texts of the accessed site that are available for web navigation and web search; and

wherein step (b) comprises analyzing the intention of the user's voice input by using the site classification result and the extracted link texts, and generating a speech recognition grammar according to the user's intention.
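The link text extraction in this claim can be sketched with Python's standard `html.parser` module. The class name, the helper function, and the sample page are hypothetical; the sketch only collects the visible text of `<a href=...>` elements and does not attempt the site classification or the intention analysis.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class LinkTextExtractor(HTMLParser):
    """Collect (link text, href) pairs from anchor elements in an HTML source."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without an href (e.g. named anchors) leave _href as None.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace so the text can be used as a spoken phrase.
            text = " ".join("".join(self._text).split())
            if text:
                self.links.append((text, self._href))
            self._href = None


def extract_link_texts(html_source: str) -> List[Tuple[str, str]]:
    parser = LinkTextExtractor()
    parser.feed(html_source)
    parser.close()
    return parser.links


page = (
    '<ul><li><a href="/sports">Sports</a></li>'
    '<li><a href="/weather"> Local <b>Weather</b> </a></li>'
    '<li><a name="top"></a></li></ul>'
)
links = extract_link_texts(page)
```

Each extracted text is a candidate phrase for the speech recognition grammar, and its `href` is the target of the corresponding web navigation command.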

The method of claim 4,

wherein, in step (b), keywords are extracted from the generated voice access list and the extracted link texts, and the speech recognition grammar is generated from the extracted keywords.

The method of claim 4,

wherein, in step (b), the analysis of the user's intention determines the intention as at least one of direct site access, web navigation, general web search, and specific web search, according to whether the web browser of the mobile terminal is running.
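The claim states only that the intention is chosen among the four categories according to whether the web browser is running. The sketch below fills in one possible mapping as an assumption: with the browser running, an utterance matching a link text is treated as web navigation and anything else as a specific web search; with the browser not running, an utterance matching a voice access list entry is treated as direct site access and anything else as a general web search. All names are hypothetical.

```python
from enum import Enum
from typing import Iterable


class Intent(Enum):
    DIRECT_SITE_ACCESS = "direct site access"
    WEB_NAVIGATION = "web navigation"
    GENERAL_WEB_SEARCH = "general web search"
    SPECIFIC_WEB_SEARCH = "specific web search"


def analyze_intent(
    utterance: str,
    browser_running: bool,
    access_list_names: Iterable[str],
    link_texts: Iterable[str],
) -> Intent:
    """Pick an intent from the browser state and the known phrases (assumed mapping)."""
    spoken = utterance.strip().lower()
    if browser_running:
        known = {text.strip().lower() for text in link_texts}
        return Intent.WEB_NAVIGATION if spoken in known else Intent.SPECIFIC_WEB_SEARCH
    known = {name.strip().lower() for name in access_list_names}
    return Intent.DIRECT_SITE_ACCESS if spoken in known else Intent.GENERAL_WEB_SEARCH


sites = ["Example News"]
page_links = ["Sports", "Local Weather"]
intent = analyze_intent("local weather", True, sites, page_links)
```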

A recording medium having recorded thereon a program for executing a method according to any one of claims 1 to 6 on a computer.

A content data management unit that analyzes the user's web history from the user's web search logs and generates a voice access list based on the analysis result;

a dynamic grammar generation unit that generates a speech recognition grammar reflecting the generated voice access list; and

a speech interpretation unit that generates a web command by matching the input voice of the user with the generated speech recognition grammar.

The apparatus of claim 8,

wherein the content data management unit comprises:

a web search log analysis unit that analyzes the user's number of visits to web sites and the distribution of visit times; and

a voice access list generation unit that generates the voice access list by using the result of the analysis by the web search log analysis unit.

The apparatus of claim 9,

wherein the voice access list generation unit generates a voice access list reflecting a default site list stored in the mobile terminal of the user and a bookmark list obtained from information registered as favorites.

The apparatus of claim 8,

wherein the content data management unit comprises:

a web site classification unit that classifies the field of a site from the URL of the site accessed by the user's mobile terminal;

a web content analysis unit that analyzes the HTML source of the web content corresponding to the accessed site; and

a link text extraction unit that extracts, from the analysis result, link texts of the accessed site that are available for web navigation and web search; and

wherein the dynamic grammar generation unit comprises:

a user intention analysis unit that analyzes the intention of the user's voice input by using the site classification result and the extracted link texts; and

a grammar generation unit that generates a speech recognition grammar according to the user's intention.

The apparatus of claim 11,

wherein the dynamic grammar generation unit further comprises a keyword extraction unit that extracts keywords from the generated voice access list and the extracted link texts, and

wherein the grammar generation unit generates the speech recognition grammar from the extracted keywords.

The apparatus of claim 11,

wherein the user intention analysis unit determines the user's intention as at least one of direct site access, web navigation, general web search, and specific web search, according to whether the web browser of the mobile terminal is running.