KR20060096654A - Mobile service system using multi-modal platform and method thereof

Info

Publication number: KR20060096654A
Application number: KR1020050017334A
Authority: KR
Inventors: 김경민; 채상호
Original assignee: 에스케이 텔레콤주식회사
Priority date: 2005-03-02
Filing date: 2005-03-02
Publication date: 2006-09-13
Also published as: KR100702789B1

Abstract

The present invention discloses a mobile service system using a multi-modal platform, and a method thereof, which can be applied to a specific mobile community service to provide voice and text information in combination. The system comprises: a mobile communication terminal that accesses the Internet through a WAP/web browser and transmits voice data together with the current URL (Uniform Resource Locator); a multi-modal platform that maps a speech recognition grammar using the current URL transmitted from the mobile communication terminal and generates the target URL of the new page to move to; an ASR (Automatic Speech Recognition) server that recognizes speech using the voice data and speech recognition grammar transmitted from the multi-modal platform; a mobile community server that, in response to a subscriber's voice request relayed by the multi-modal platform, registers content of a WAP or multi-modal community application, receives the result, and delivers it to the mobile communication terminal; and a web server that receives from the mobile community server a voice content registration request based on the subscriber's voice and returns the result.

Multi-Modal, Cyworld, Community, Mobile, Voice Recognition

Description

Mobile service system using multi-modal platform and method thereof

The drawings attached to this specification illustrate preferred embodiments of the present invention and, together with the detailed description below, serve to further the understanding of the technical idea of the invention. The invention should therefore not be construed as limited to what is shown in those drawings.

FIG. 1 is a schematic configuration diagram of a mobile service system using a multi-modal platform according to the present invention.

FIG. 2 is a block diagram showing the configuration of a mobile communication terminal.

FIG. 3 is a block diagram showing the configuration of the server for speech recognition of FIG. 2.

FIG. 4 is a flowchart illustrating a mobile service method using a multi-modal platform according to the present invention.

FIG. 5 is a flowchart illustrating a mobile service method according to another embodiment of the present invention.

Description of reference numerals for main parts of the drawings

10: mobile communication terminal
11: microphone
12: EVRC encoder
13: multi-modal module
14: WAP/web browser
15: wireless module
20: exchange
30: multi-modal platform
31: network connection unit
32: voice data conversion unit
33: speech recognition grammar mapping unit
34: global grammar mapping unit
35: target URL generation unit
36: database
40: ASR (Automatic Speech Recognition) server
50: mobile community server
60: web server

The present invention relates to a wireless WAP service using a multi-modal platform and, more particularly, to a mobile service system using a multi-modal platform, and a method thereof, in which multi-modal technology is applied to a specific mobile community service so that voice and text information can be provided in combination.

As mobile communication terminals have spread, the need to use information on the Internet from such terminals has grown. A mobile communication terminal, however, differs markedly from a PC in hardware performance, network speed, screen size, and input devices, so Internet browsers and content designed for a conventional PC and the wired Internet could not be used as they were. New content formats and input methods, such as WAP and navigation with numeric keys, addressed part of this problem, but the small screen of a mobile communication terminal can show only a limited number of menus and links at once. Content is organized as a multi-level tree, so to move from the menu of the initially set page to the page holding the desired material, the user must press specific keys repeatedly along the chain of links before reaching the final content. Entering text such as a URL and pressing special function keys to make successive selections down the tree therefore lengthens the time needed for access.

Attempts have therefore been made to relieve this inconvenience by using mobile communication terminals with a built-in speech recognizer.

However, a mobile communication terminal capable of speech recognition must include a separate speech recognition module, and it can process only voice data that matches the voice commands held in its built-in memory. Such a terminal therefore consumes storage resources and, because a separate module is required, space inside the terminal. For this reason, services have come to be provided through spoken commands using VoiceXML and speech recognition technology, free of the constraints of the terminal's narrow screen and limited key input.

Meanwhile, one-person media and community services that are popular on the wired Internet (for example, Nate's Cyworld and Naver's blog) are being brought to mobile. A subscriber can use mobile WAP to view and post to his or her own minihompy and can also look up other people's minihompies. Such mobile community services may take the form of WAP-based services or mobile phone applications.

Research is under way on services that apply multi-modal technology to such mobile community services so that voice and text information can be combined.

The present invention has been devised to solve the above problems of the prior art. Its object is to provide a mobile service system using a multi-modal platform, and a method thereof, that applies multi-modal technology to a specific mobile community service and provides voice and text information in combination, so that a subscriber can post simply by voice instead of laboriously entering information on the keypad, thereby overcoming the inconvenient input limitations.

To achieve the above object, a mobile service system using a multi-modal platform according to a first aspect of the present invention comprises: a mobile communication terminal that accesses the Internet through a WAP/web browser and transmits voice data together with the current URL (Uniform Resource Locator); a multi-modal platform that maps a speech recognition grammar using the current URL transmitted from the mobile communication terminal and generates the target URL of the new page to move to; an ASR (Automatic Speech Recognition) server that recognizes speech using the voice data and speech recognition grammar transmitted from the multi-modal platform; a mobile community server that, in response to a subscriber's voice request relayed by the multi-modal platform, registers content of a WAP or multi-modal community application, receives the result, and delivers it to the mobile communication terminal; and a web server that receives from the mobile community server a voice content registration request based on the subscriber's voice and returns the result.

To achieve the above object, a mobile service system using a multi-modal platform according to a second aspect of the present invention comprises: a mobile communication terminal having a browser that accesses the Internet over a wireless network, an encoder that converts voice data input through a microphone, and a multi-modal module that transmits to a multi-modal server the URL information of the site currently accessed by the browser, the voice data converted by the encoder, and information about the mobile communication terminal; a multi-modal platform that determines the grammar needed for speech recognition from the current URL information transmitted from the mobile communication terminal, transmits that speech recognition grammar together with the voice data to an ASR server, and generates from the result recognized by the speech recognition (ASR) server a target URL to access, which it transmits to the mobile communication terminal; an ASR server that recognizes speech using the voice data and speech recognition grammar transmitted from the multi-modal platform and transmits the recognized result to the multi-modal platform; and a web server that the mobile communication terminal accesses using the target URL transmitted from the multi-modal platform.

In addition, a mobile service method using a multi-modal platform according to the present invention comprises the steps of:
A) the controller of the mobile communication terminal determining whether the WAP/web browser has been launched by a specific key and, if it has, converting the voice input through the terminal's microphone in the encoder, storing it, and transmitting the stored information together with the URL of the initially set page;
B) the multi-modal platform generating the speech recognition grammar needed for speech recognition using the URL of the current page transmitted from the mobile communication terminal, and transmitting the generated grammar together with the voice data to the ASR server;
C) the speech recognition (ASR) server recognizing the transmitted speech using the transmitted voice data and speech recognition grammar, and transmitting the recognized result to the multi-modal platform;
D) the mobile community server receiving from the multi-modal platform a request message to store the subscriber's recorded voice message, and then forwarding a content registration request based on that request message to the web server; and
E) the web server transmitting a response message to the content registration request to the mobile communication terminal via the mobile community server and the multi-modal platform.

According to the present invention, therefore, multi-modal technology is applied to a specific mobile community service to provide voice and text information in combination, so that the application can be used conveniently through speech-recognition-based navigation to other people's minihompies, voice commands, and the like.

In assigning reference numerals to the components of the accompanying drawings, note that the same components carry the same numerals wherever possible, even when they appear in different drawings. The following description and the accompanying drawings also set out many specific details, such as concrete processing flows, to provide a more thorough understanding of the present invention. It will be apparent to those of ordinary skill in the art that the invention may be practiced without these specific details. Detailed descriptions of well-known functions and configurations that could unnecessarily obscure the subject matter of the invention are omitted.

The present invention applies multi-modal technology to a mobile community application. With multi-modal technology, a subscriber can post simply by voice instead of laboriously entering information on the keypad, overcoming the inconvenient input limitations, and can use the application conveniently through speech-recognition-based navigation to other people's minihompies, voice commands, and the like.

Preferred embodiments of the present invention are described below in more detail with reference to the accompanying FIGS. 1 to 5.

FIG. 1 shows the configuration of a mobile service system using a multi-modal platform according to the present invention.

As shown in the figure, the system consists of: a mobile communication terminal 10 that accesses the Internet through a WAP/web browser 14 and transmits voice data together with the current URL (Uniform Resource Locator); an exchange 20; a multi-modal platform 30 that maps a speech recognition grammar using the current URL transmitted from the mobile communication terminal 10 and generates the target URL of the new page to move to; an ASR (Automatic Speech Recognition) server 40 that recognizes speech using the voice data and speech recognition grammar transmitted from the multi-modal platform 30; a mobile community server 50 that, in response to a subscriber's voice request relayed by the multi-modal platform 30, registers content of a WAP or multi-modal community application with the web server 60 described below, receives the result, and delivers it to the mobile communication terminal 10; and a web server 60 that receives from the mobile community server 50 a voice content registration request based on the subscriber's voice and returns the result.

FIG. 2 is a block diagram showing the configuration of the mobile communication terminal. The mobile communication terminal 10 consists of a call unit (not shown), a microphone 11, an EVRC encoder 12, a multi-modal module 13, a WAP/web browser 14, and a wireless module 15.

The mobile communication terminal 10 has a built-in multi-modal application and provides functions such as viewing the subscriber's minihompy, voice/text posting, and voice/text viewing.

The encoder 12 compresses and converts the voice input through the microphone 11. The encoder 12 is an 8 Kbps EVRC or 13 Kbps QCELP encoder, and other kinds of encoder may be used depending on the type of mobile communication terminal; preferably, the encoder 12 is an 8 Kbps EVRC encoder.

The multi-modal module 13 transmits the voice data compressed by the encoder 12, the URL information of the current page obtained from the browser 14, and terminal information (for example, the type of browser and the mobile phone number) to the speech recognition system (not shown) through the wireless module 15.
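The three items the multi-modal module 13 sends can be pictured as one request record. The following is only a minimal sketch: the class name, field names, and the `build_request` helper are assumptions made for illustration, and the specification does not define a wire format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionRequest:
    """Hypothetical record of what the multi-modal module (13) sends out."""
    voice_data: bytes    # speech already compressed by the encoder (12)
    current_url: str     # URL of the page open in the WAP/web browser (14)
    browser_type: str    # terminal information: kind of browser
    phone_number: str    # terminal information: mobile phone number


def build_request(voice_data: bytes, current_url: str,
                  browser_type: str, phone_number: str) -> RecognitionRequest:
    # Without speech there is nothing to recognize; the caller keeps browsing.
    if not voice_data:
        raise ValueError("no voice data to send")
    return RecognitionRequest(voice_data, current_url, browser_type, phone_number)
```

Keeping the record immutable reflects that the platform only reads these values; it never writes back into the terminal's request.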

The server for speech recognition consists of the multi-modal platform 30, which generates the unique speech recognition grammar used for recognition and the target URL to move to according to the recognition result, and the ASR (Automatic Speech Recognition) server 40, which recognizes speech using the voice data and speech recognition grammar transmitted from the multi-modal platform 30 and transmits the recognition result back to the multi-modal platform 30.

The multi-modal platform 30 consists of a network connection unit 31, a voice data conversion unit 32, a speech recognition grammar mapping unit 33, a global grammar mapping unit 34, a target URL generation unit 35, and a database 36.

The network connection unit 31 connects to the mobile communication terminal (not shown) to send and receive data, preferably over the TCP/IP protocol.

The voice data conversion unit 32 is connected to the network connection unit 31 and converts the voice data passed on by the network connection unit 31 into a form the speech recognition engine can process, preferably PCM. If the speech recognition engine can directly process the compressed format sent by the mobile communication terminal (for example, EVRC), this conversion can be omitted.

The speech recognition grammar mapping unit 33 detects, from the URL passed on by the network connection unit 31, the unique speech recognition grammar valid for that page and maps it to the transmitted URL; that is, it maps the conditions under which given voice commands may occur for the current page. In other words, a speech recognition grammar specifies the voice commands that may be issued at a given URL, and the list of voice commands that can be spoken at that URL constitutes the grammar. For example, when a company name (for example, Nate or Naver) is spoken on a page that presents one-person media and community services, the list of company names related to those services becomes the speech recognition grammar, and that grammar is mapped to the corresponding URL (for example, Cyworld (http://cyworld.nate.com) or blog (http://blog.naver.com)).

The global grammar mapping unit 34 combines the unique speech recognition grammar determined by the speech recognition grammar mapping unit 33 with the speech recognition grammar for voice commands that are valid regardless of the current page of the browser (not shown), such as help and bookmark.
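The interplay of units 33 and 34 can be sketched as a lookup followed by a merge. This is an illustrative sketch only: the table contents for each URL are hypothetical examples, and the specification does not say how the grammars are stored in database 36.

```python
# Page-specific grammars keyed by URL (unit 33). The command lists are
# hypothetical; only the two URLs appear in the specification.
PAGE_GRAMMARS = {
    "http://cyworld.nate.com": ["voice posting", "voice message list"],
    "http://blog.naver.com": ["new post", "comments"],
}

# Commands valid on every page (unit 34); "help" and "bookmark" are the
# examples given in the specification.
GLOBAL_GRAMMAR = ["help", "bookmark"]


def page_grammar(current_url):
    """Return the list of voice commands valid only at this URL."""
    return list(PAGE_GRAMMARS.get(current_url, []))


def active_grammar(current_url):
    """Combine the page grammar with the global grammar, without duplicates."""
    merged = page_grammar(current_url)
    for command in GLOBAL_GRAMMAR:
        if command not in merged:
            merged.append(command)
    return merged
```

An unknown URL still yields the global commands, which matches the idea that they are valid regardless of the current page.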

The target URL generation unit 35 is connected to the network connection unit 31 and the ASR server 40. Using the transmitted terminal information and the command obtained from the speech recognition result, it generates the URL designated by that command as the target URL to move to and passes it to the network connection unit 31. For example, when 'voice posting' or 'selecting from the list of text and voice messages left by others' is recognized on the Cyworld site, the URL of the corresponding page among the sub-pages of the Cyworld site is set as the target URL.
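A minimal sketch of the target URL step follows. The paths in `COMMAND_PATHS` are invented for illustration and are not real Cyworld paths; the sketch also ignores the terminal information, which the specification says unit 35 takes into account.

```python
from typing import Optional
from urllib.parse import urljoin

# Hypothetical mapping from a recognized command to a sub-page path.
COMMAND_PATHS = {
    "voice posting": "/voice/post",
    "voice message list": "/voice/messages",
}


def generate_target_url(current_url: str, command: str) -> Optional[str]:
    """Resolve a recognized command to the URL of the page to move to."""
    path = COMMAND_PATHS.get(command)
    if path is None:
        return None  # command has no page of its own on this site
    # urljoin keeps the scheme and host of the current page.
    return urljoin(current_url, path)
```

For instance, `generate_target_url("http://cyworld.nate.com", "voice posting")` resolves against the current site rather than a fixed host, so the same table could serve several community sites.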

The ASR (Automatic Speech Recognition) server 40 is connected to the multi-modal platform 30 and recognizes speech using the voice data and speech recognition grammar transmitted from the multi-modal platform 30. The present invention uses a known speech recognition system for this purpose.

The ASR server 40 recognizes speech by distinguishing simple voice commands, which read out a currently displayed menu item, from shortcut voice commands, which cut across several levels of the menu tree. A simple voice command is issued when the user reads a menu item on the screen; for menu items made up of several words, various alternative labels (aliases) must be considered. For example, a user may shorten a menu item such as "Check voice messages left by others" to "others' voice messages", so several aliases are added to the grammar for the user's convenience.
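Alias handling for simple voice commands can be sketched as an index from every spoken variant to its menu item. The second alias below is a hypothetical addition, and the whitespace and case normalization is an assumption of this sketch, not something the specification describes.

```python
# Menu label -> alternative labels (aliases). The first alias is the example
# from the specification; the second is hypothetical.
MENU_ALIASES = {
    "check voice messages left by others": [
        "others' voice messages",
        "voice messages from others",
    ],
}


def normalize(text):
    """Lower-case and collapse whitespace so variants compare equal."""
    return " ".join(text.lower().split())


def build_alias_index(menu_aliases):
    """Map every label and alias to the menu label it stands for."""
    index = {}
    for label, aliases in menu_aliases.items():
        for phrase in [label, *aliases]:
            index[normalize(phrase)] = label
    return index


def resolve(utterance, index):
    """Return the menu label for a recognized utterance, or None."""
    return index.get(normalize(utterance))
```

Building the index once keeps lookup cheap when a page has many menu items with several aliases each.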

A shortcut voice command, by contrast, issues a command in one step without passing through the menu levels; it is built by collecting the utterance patterns (corpus) of users of the terminal application. For example, for an utterance such as "select the list of voice messages left by others", the grammar covers the specific phrase "select list" in addition to "voice message", for the user's convenience. To support shortcut voice commands, the ASR server 40 must be capable of continuous speech recognition, and it is implemented so that the grammar can accommodate a grammar structure in ABNF (Augmented Backus-Naur Form) or an equivalent form.
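One way to picture a shortcut grammar is as generated ABNF text. The sketch below assumes the ABNF form of the W3C Speech Recognition Grammar Specification (header line, `$rule` references, `[...]` for optional parts); the specification itself only names ABNF or an equivalent form, and the rule names and function names here are hypothetical.

```python
def abnf_rule(name, alternatives, public=False):
    """Render one rule as '$name = alt1 | alt2;'."""
    prefix = "public " if public else ""
    return f"{prefix}${name} = {' | '.join(alternatives)};"


def shortcut_grammar(menus, actions, language="en-US"):
    """Build a grammar for utterances of the form: menu name, then optional action."""
    return "\n".join([
        "#ABNF 1.0 UTF-8;",
        f"language {language};",
        "root $shortcut;",
        abnf_rule("menu", menus),
        abnf_rule("action", actions),
        # A menu name optionally followed by an action, spoken as one utterance.
        abnf_rule("shortcut", ["$menu [$action]"], public=True),
    ])
```

Calling `shortcut_grammar(["voice message"], ["select list"])` would cover both "voice message" alone and "voice message select list", which is why continuous speech recognition is needed on the ASR side.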

FIG. 4 is a flowchart illustrating a mobile service method using a multi-modal platform according to the present invention; it shows the process by which a subscriber posts by voice.

The embodiment below describes how a subscriber issues voice commands through the terminal's WAP/web browser 14 to move to the URL of a page presenting one-person media and community services (for example, Cyworld (http://cyworld.nate.com) or blog (http://blog.naver.com)) and, at that URL, performs 'posting his or her own voice' or 'checking messages by selecting from the list of text and voice messages left by others'.

First, the controller (not shown) of the mobile communication terminal 10 determines whether the WAP/web browser 14 has been launched by a specific key (S40). If the WAP/web browser 14 has not been launched, the terminal performs its ordinary functions (S41). If it has been launched, the voice input through the terminal's microphone is converted in the encoder and stored, and the stored information is transmitted to the multi-modal platform 30 together with the URL of the initially set page (for example, Cyworld (http://cyworld.nate.com) or blog (http://blog.naver.com)) (S42).

In step S42, when the WAP/web browser 14 has been launched, the voice data spoken by the subscriber into the mobile communication terminal 10, the URL information of the current page, and information about the mobile communication terminal 10 are transmitted to the multi-modal platform 30.

In this case, the terminal determines whether input has arrived from the microphone 11. If there is no voice input, ordinary web browsing proceeds. If there is voice input, the voice from the microphone 11 is converted in the encoder 12 and passed to the multi-modal module 13. On receiving the converted voice data, the multi-modal module 13 requests the browser information and the URL information of the current page from the running WAP/web browser 14, and transmits the converted voice data, the obtained URL information of the current page, and the mobile communication terminal information (for example, browser type and mobile phone number) to the multi-modal platform 30 through the wireless module 15. Since this is already known to those skilled in the art, further description is omitted.
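The terminal-side branching in steps S40 to S42 can be summarized as a small decision function. This is a sketch under assumptions: the function name, the action labels, and the `encode` callback (standing in for the EVRC encoder 12) are all hypothetical.

```python
def terminal_step(browser_running, mic_samples, encode, current_url, terminal_info):
    """Decide what the terminal does for one round of input.

    encode is a callable standing in for the encoder (12); the real EVRC
    codec is not modelled here.
    """
    if not browser_running:
        return ("normal_function", None)       # S41: ordinary phone functions
    if not mic_samples:
        return ("web_browsing", None)          # no speech: ordinary browsing
    payload = {
        "voice_data": encode(mic_samples),     # compressed by the encoder
        "current_url": current_url,            # obtained from the browser (14)
        "terminal_info": terminal_info,        # e.g. browser type, phone number
    }
    return ("send_to_platform", payload)       # S42: hand over to platform (30)
```

Passing the encoder in as a callback keeps the branching testable without a codec and mirrors the note that other encoders may be used depending on the terminal.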

Next, the multi-modal platform 30 generates the speech recognition grammar needed for recognition using the URL of the current page transmitted from the mobile communication terminal 10, and transmits the generated grammar together with the voice data to the ASR server 40 (S43). At this point, to post by voice, the subscriber selects the record menu on the relevant page of that URL and records the voice message.

Also in step S43, the multi-modal platform 30 connects to the mobile communication terminal 10 requesting speech recognition, converts the voice data sent by the terminal to PCM so that the speech recognition engine can process it, and at the same time receives the current URL and terminal information from the terminal. It maps the received URL information to the unique speech recognition grammar corresponding to that URL, and combines the mapped grammar with the global speech recognition grammar that is valid on every page.

Thereafter, the speech recognition (ASR) server (40) recognizes the transmitted voice using the transmitted voice data and speech recognition grammar, and transmits the recognition result to the multi-modal platform (30) (S44). Any known speech recognition method may be used here; a speaker-independent method is preferred.

Thereafter, the mobile community server (50) receives from the multi-modal platform (30) the speech recognition result, that is, a request message to store the subscriber's voice message (the subscriber's voice recorded for posting) (S45), and then forwards a content registration request corresponding to that request message to the web server (60) (S46).

Thereafter, the web server (60) transmits a response message for the content registration request to the mobile communication terminal (10) via the mobile community server (50) and the multi-modal platform (30) (S47). Through this process, the subscriber learns that the voice message has been posted.
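Steps S43 to S47 can be traced with stub servers, as in the sketch below. Everything in it is an assumption made for illustration: the ASR stub merely decodes bytes as text, the recognized command "record" is taken to stand for the store request, and the split between a spoken command and the recorded message is a modeling choice rather than something the description spells out.

```python
from typing import Dict, List, Tuple


def asr_recognize(voice_data: bytes, grammar: List[str]) -> str:
    """Stub ASR server: decodes the bytes as text and accepts only grammar entries."""
    text = voice_data.decode("utf-8", errors="replace")
    return text if text in grammar else ""


def web_server_register(message: bytes) -> Dict[str, object]:
    """Stub web server (60): 'registers' the content and returns a result."""
    return {"status": "registered", "size": len(message)}


def post_voice_message(command: bytes, message: bytes,
                       grammar: List[str]) -> Tuple[Dict[str, object], List[str]]:
    """Walk one posting through S43-S47 and record which steps ran."""
    trace: List[str] = []
    trace.append("S43")                      # platform sends voice and grammar to ASR
    result = asr_recognize(command, grammar)
    trace.append("S44")                      # ASR returns the recognition result
    if result != "record":
        # Assumed behaviour: anything other than a store request ends here.
        return {"status": "rejected", "size": 0}, trace
    trace.append("S45")                      # community server receives store request
    trace.append("S46")                      # registration request goes to web server
    response = web_server_register(message)
    trace.append("S47")                      # response travels back to the terminal
    return response, trace
```

The trace list makes the ordering of the steps explicit, which is the only property of the flow that this sketch tries to capture.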

FIG. 5 is an operational flowchart illustrating a mobile service method according to another embodiment of the present invention; it illustrates the process for another party's voice posting.

First, the control unit (not shown) of the mobile communication terminal (10) launches the WAP/web browser (14) in response to a specific key and then, at the URL of the initially set page (for example, Cyworld (http://cyworld.nate.com) or a blog (http://blog.naver.com)), transmits to the multi-modal platform (30) a request message to listen to a voice message recorded by another party (S51).

Thereafter, the multi-modal platform (30) receives the voice message listening request and forwards it to the mobile community server (50), thereby requesting a fetch of the voice message (S52). In response to the request, the mobile community server (50) generates and transmits the target URL at which the voice message recorded by the other party resides, and at the same time the other party's voice, converted by the voice data conversion unit (32) (for example, into EVRC format), is transmitted to the mobile communication terminal (10) (S53).

Thereafter, the mobile communication terminal (10) displays the received URL page on the display device and simultaneously outputs the other party's recorded voice to the subscriber through the speaker (S54). The other party's recorded voice messages are presented as a list, and the subscriber may choose which of them to listen to.
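The listening flow of steps S52 to S54 can be sketched in the same spirit. The message store, the "/voice" suffix used to form the target URL, and the `EVRC:` prefix that stands in for real transcoding are all hypothetical; the description only states that a target URL is generated and that the voice is converted (for example, to EVRC) before delivery.

```python
from typing import Dict, List

# Hypothetical store of messages left by other parties, keyed by page URL.
VOICE_MESSAGES: Dict[str, List[bytes]] = {
    "http://cyworld.nate.com": [b"pcm-message-1", b"pcm-message-2"],
}


def to_terminal_format(pcm: bytes) -> bytes:
    """Stand-in for the voice data conversion unit (32); no real EVRC encoding."""
    return b"EVRC:" + pcm


def fetch_voice_messages(page_url: str) -> Dict[str, object]:
    """S52-S53: build the target URL and return the converted messages as a list."""
    messages = VOICE_MESSAGES.get(page_url, [])
    return {
        "target_url": page_url + "/voice",   # assumed URL scheme
        "messages": [to_terminal_format(m) for m in messages],
    }


def play_selected(response: Dict[str, object], index: int) -> bytes:
    """S54: the subscriber picks one entry from the list to listen to."""
    return response["messages"][index]
```

Returning the messages as a list mirrors the statement that the recordings are listed and that the subscriber selects which one to hear.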

Accordingly, the subscriber can view and listen to the voice and text messages recorded by other parties and stored on the subscriber's own mini-homepage, free from constraints of time and place.

Although the present invention has been described above with reference to specific preferred embodiments, it is not limited to those embodiments, and anyone with ordinary skill in the art to which the invention pertains may make various modifications without departing from the gist of the invention as claimed in the claims.

Therefore, according to the present invention, the quality of community services delivered through mobile devices is improved.

In addition, the present invention uses voice to ease the inconvenience of input on mobile devices, and by providing voice and display together it greatly improves how well information is conveyed to the subscriber.

Claims

A mobile communication terminal that accesses the Internet through a WAP/web browser and transmits voice data and the current Uniform Resource Locator (URL);

A multi-modal platform that maps a speech recognition grammar using the current URL transmitted from the mobile communication terminal and generates the target URL of the new page to navigate to;

An Automatic Speech Recognition (ASR) server for recognizing speech using speech data and speech recognition grammar transmitted from the multi-modal platform;

A mobile community server that, by way of the multi-modal platform, registers content of a WAP or multi-modal community application in response to a subscriber's voice request, receives the result, and delivers it to the mobile communication terminal; and

A web server that receives from the mobile community server a voice content registration request based on the subscriber's voice and returns the result.

The system of claim 1, wherein the mobile communication terminal

has a built-in multi-modal application in order to provide homepage lookup, voice/text posting, and voice/text viewing functions for the subscriber, and comprises:

An encoder that compresses and converts the voice input through the microphone and outputs the compressed voice;

A multi-modal module that outputs the voice data compressed and converted by the encoder, the URL information of the current page obtained from the WAP/web browser, and the terminal information; and

A wireless module that receives the voice data and each item of information from the multi-modal module and transmits them wirelessly to a speech recognition system.

The system of claim 2, wherein the encoder uses 8 kbps EVRC or 13 kbps QCELP, the codec used differing according to the type of mobile communication terminal.

The system of claim 2, wherein the terminal information is either the type of WAP/web browser or the mobile phone number.

The system of claim 1, wherein the mobile communication terminal and the multi-modal platform exchange the subscriber's voice and its speech recognition result with each other over a TCP connection.

The system of claim 1, wherein the Automatic Speech Recognition (ASR) server distinguishes the voice collected from the subscriber as either a simple voice command that reads out a currently displayed menu item or a short voice command that traverses a multi-level menu tree.

The system of claim 1 or 6, wherein, when the voice input from the subscriber is a simple voice command in which the user reads out a menu item on the screen, the Automatic Speech Recognition (ASR) server adds various alternative labels (aliases) to the grammar for the user's convenience.

A mobile service system using a multi-modal platform, wherein, when the voice input from the subscriber is a short voice command, continuous word recognition is implemented and Augmented Backus-Naur Form (ABNF) is applied to the grammar.

A mobile communication terminal having a WAP/web browser that accesses the Internet through a wireless network, an encoder that converts voice data input through the microphone, and a multi-modal module that transmits the URL information of the site currently being accessed in the WAP/web browser, the voice data converted by the encoder, and the terminal information;

A multi-modal platform that determines the grammar required for speech recognition from the current URL information transmitted from the mobile communication terminal, transmits the speech recognition grammar together with the voice data, and generates a target URL to be accessed from the result recognized by the speech recognition server and transmits it to the mobile communication terminal;

An ASR server for recognizing speech using speech data and speech recognition grammar transmitted from the multi-modal platform, and transmitting the recognized result to the multi-modal platform;

The system of claim 9, wherein the encoder,

The system of claim 9, wherein the information of the terminal,

The system of claim 9, wherein the mobile communication terminal and the multi-modal platform,

The system of claim 9, wherein the Automatic Speech Recognition (ASR) server classifies the voice collected from the subscriber as either a simple voice command that reads out a currently displayed menu item or a short voice command that traverses a multi-level menu tree.

The system of claim 9 or 13, wherein the Automatic Speech Recognition (ASR) server,

A) the controller of the mobile communication terminal determining whether a WAP/web browser has been launched by a specific key and, when the WAP/web browser has been launched, converting the voice input through the terminal microphone in the encoder, storing it, and transmitting it together with the URL of the initially set page;

B) the multi-modal platform generating the speech recognition grammar required for speech recognition using the URL of the current page transmitted from the mobile communication terminal, and transmitting the generated speech recognition grammar to the ASR server together with the voice data;

C) the speech recognition server recognizing the transmitted voice using the transmitted voice data and the speech recognition grammar, and transmitting the recognized result to the multi-modal platform;

D) the mobile community server receiving from the multi-modal platform a request message to store the subscriber's recorded voice message, and then forwarding a content registration request corresponding to the request message to a web server; and

E) the web server transmitting a response message for the content registration request to the mobile communication terminal via the mobile community server and the multi-modal platform, these steps constituting a mobile service method using a multi-modal platform.

The method of claim 16, wherein the voice of step (A) is divided into simple voice commands and short voice commands that traverse a multi-level menu tree.

The method of claim 16, wherein step (A) comprises:

A-1) transmitting the voice data input by the subscriber to the mobile communication terminal, the URL information of the current page, and the mobile communication terminal information to the multi-modal platform.

The method of claim 18, wherein step (A-1) comprises determining whether input has occurred at the microphone of the terminal; performing ordinary web surfing if no voice input has occurred; if voice input has occurred, converting the voice input from the microphone in the encoder and passing it to the multi-modal module; and, once the converted voice data has been passed, the multi-modal module requesting the browser information and the URL information of the current page from the currently running WAP/web browser and transmitting the converted voice data, the obtained URL information of the current page, and the mobile communication terminal information to the multi-modal platform through a wireless module.

The method of claim 16, wherein step (A) further comprises performing the general functions of the mobile communication terminal if the WAP/web browser has not been launched.

The method of claim 16, wherein in step (B) the subscriber selects a recording menu for voice posting on a specific page of the URL and then records the voice.

The method of claim 16, wherein in step (B) the multi-modal platform connects to the mobile communication terminal requesting speech recognition, converts the voice data transmitted from the mobile communication terminal into PCM so that it can be processed for speech recognition while receiving the current URL and the terminal information from the terminal, maps the received current URL information to the unique speech recognition grammar corresponding to each URL, and combines the mapped unique speech recognition grammar with a global speech recognition grammar valid on any page.