KR20110030947A

KR20110030947A - Internet protocol television broadcasting system, server and apparatus for generating lexicon

Info

Publication number: KR20110030947A
Application number: KR1020090088629A
Authority: KR
Inventors: 왕지현; 정의석; 강병옥
Original assignee: 한국전자통신연구원
Priority date: 2009-09-18
Filing date: 2009-09-18
Publication date: 2011-03-24
Also published as: KR101606170B1

Abstract

PURPOSE: An IPTV broadcasting system, a server, and a lexicon generating apparatus are provided to supply a lexicon for IPTV voice recognition. CONSTITUTION: A pattern database (111) stores utterance generation patterns used to offer an IPTV (Internet Protocol Television) broadcast service through voice recognition. Using the utterance generation patterns, a lexicon generating unit (110) generates a lexicon for voice recognition corresponding to a user's voice command. A structure information extraction unit extracts extraction-type keywords from the IPTV broadcasting information data.

Description

IPTV Broadcasting System, Server and Apparatus for Generating Lexicon

The present invention relates to an IPTV (Internet Protocol Television) broadcasting service, and more particularly to an IPTV broadcasting system, a server, and an utterance list (lexicon) generating apparatus that can recognize a user's voice and provide the corresponding IPTV broadcasting service.

The present invention is derived from research conducted as part of the IT Growth Engine Core Technology Development Program of the Ministry of Knowledge Economy [Project management number: 2006-S-036-04, Project title: Development of large-capacity interactive distributed-processing voice interface technology for new growth engine industries].

As digital technology advances, broadcasting services delivered to TVs over the Broadband Convergence Network (BcN) are on the rise. In particular, IPTV broadcasting services are attracting attention because they can provide one-way and two-way services, such as over-the-air, cable, and terrestrial broadcasts, movies, music, interactive quiz shows, TV banking, and Internet search, through a TV connected to an IP-based communication network.

In an IPTV broadcasting service, the provider transmits the content of the channel selected by the user through a head-end, and the user at home receives the IPTV broadcasting service through an Internet-capable set-top box (STB) and a TV connected to it.

Conventional TVs offered no more than a remote control for user convenience, whereas IPTV incorporates voice recognition technology so that selecting a specific menu, entering a specific command, requesting a specific channel, or selecting content can be done by recognizing the user's voice.

Such speech recognition systems apply either an isolated word recognition method or a continuous speech recognition method, depending on the form of the utterance.

First, the isolated word recognition method, used in applications such as mobile phone voice dialing, has the speaker pronounce each word separately. Because a substantial silent interval exists before and after each word, the beginning and end of a word are easy to detect and the recognition rate is high.

The continuous speech recognition method recognizes speech sentence by sentence, as it is ordinarily spoken, with no silence deliberately inserted between words. Recognition is therefore somewhat harder because of the coarticulation effect, in which the characteristics of a word are influenced by the pronunciation of adjacent words.

For this reason, most speech recognition systems in Korea and abroad adopt the isolated word recognition method, which offers high recognition performance within a limited application range. However, isolated word recognition requires the vocabulary to be recognized to be prepared in advance, so a way to prepare that vocabulary effectively is needed.

To solve the above problem, an object of the present invention is to provide an IPTV broadcasting system, a server, and an utterance list generating apparatus capable of generating a list of utterances for speech recognition that accommodates various forms of utterance.

An apparatus for generating an utterance list for IPTV broadcasting according to one aspect of the present invention comprises: a pattern database that stores one or more utterance generation patterns for providing an IPTV broadcasting service through speech recognition; and an utterance list generation unit that uses the utterance generation patterns to generate an utterance list of speech recognition utterances corresponding to a user's voice command.

An IPTV broadcasting system according to another aspect of the present invention comprises: an IPTV broadcasting server that generates and provides an utterance list including one or more utterances corresponding to a user's voice command, so as to provide an IPTV broadcasting service according to the user's voice command; and an IPTV set-top device that recognizes the user's voice command and interprets and processes the recognized voice command using the utterance list.

An IPTV broadcasting server according to yet another aspect of the present invention comprises: a first database that stores IPTV broadcast data; an utterance list generation unit that uses the IPTV broadcast data to generate an utterance list corresponding to user commands; a second database that stores the generated utterance list; a command interpretation unit that interprets recognized speech using the utterance list; a command processing unit that searches for IPTV broadcast data corresponding to the interpreted speech or executes an IPTV broadcast application corresponding to the interpreted speech; and a transmission unit that transmits the retrieved IPTV broadcast data or the execution result of the application to an in-home device.

According to the present invention, utterance types are classified and defined in detail, and utterances are generated with each type taken into account, so that an utterance list for IPTV speech recognition can be provided that corresponds to the varied ways users speak.

As described above, recent IPTV broadcasting provides not only touch-type interfaces such as a keyboard but also a speech recognition interface. To provide an effective speech recognition interface, the present invention extracts and classifies utterances related to the spoken vocabulary from the text-based broadcast information data of IPTV broadcasts, and generates and provides utterance lists that correspond to users' varied utterances.

To aid understanding, before the configuration of the present invention is described, the utterances that make up the utterance lists generated by the present invention for speech recognition are described first.

By content, utterances include device operation commands, menu operation commands, content keywords, simple domain topic words, compound domain topic words, and constrained domain topic words. By form, they include single-word utterances and natural-language utterances. Each is briefly described below.

A device operation command is a command that operates the IPTV set-top device, for example 'Turn On', 'Power off', 'Volume up', or 'Volume down four steps'.

A menu operation command is an application execution command that manipulates the user interface (UI) of the IPTV broadcasting service to switch menus or request content, for example 'To the upper menu', 'To My Home', 'By latest movie', or 'By lowest price'.

Content keywords are divided into dictionary-type keywords and extraction-type keywords. They are strings extracted from the IPTV broadcast information data, which describes information about IPTV broadcast content.

A dictionary-type keyword is a word consisting of a single noun extracted from the IPTV broadcast information data. It includes proper nouns denoting a person, place, or organization, such as '이효리' (Lee Hyo-ri), '뿡뿡이' (Ppungppungi), and '삼성 코엑스' (Samsung COEX), and technical terms such as 'HIV(AIDS)' and '영장실질심사제' (the substantive warrant review system).

An extraction-type keyword is a main title or a piece of additional information (for example, a model name, episode information, or a trailer label) extracted from the IPTV broadcast information data. For example, if the IPTV broadcast information data is "<괜찮아, 울지마> 예고편" ("<It's Okay, Don't Cry> trailer"), the extraction-type keywords may be '괜찮아 울지마' or '예고편'. If it is "12회 VJ네트워크 - 아차산, 이천한우농장, 충주시" ("Episode 12 VJ Network - Achasan, Icheon Hanwoo Farm, Chungju-si"), they may be the title 'VJ 네트워크' and the episode information '12회'. If it is "캐논 DSLR EOS-500D" ("Canon DSLR EOS-500D"), the keyword may be the model name 'EOS-500D'.

A simple domain topic word (or simply a domain topic word) is a term used frequently not only for VOD content but also in information services such as shopping or weather: a category word used for classification, a term that describes a feature, or a term frequently used in a specific domain. Examples include category words in the shopping domain such as 'home appliances', 'furniture', 'books', and 'office supplies'; genre words for VOD content such as 'action', 'drama', and 'SF'; and weather-domain terms such as 'highest temperature', 'lowest temperature', 'humidity', and 'precipitation'.

A compound domain topic word consists of two or more simple domain topic words combined, or of two or more nouns combined with a simple domain topic word, for example 'fantasy movie', 'thriller movie', 'next week's weather', 'tomorrow's stock price trend', or 'programs showing today'.

A constrained domain topic word is used to search for content and uses a modifier to restrict the search target. That is, it consists of a modifier combined with a domain topic word, and the modifier may be an adnominal phrase (for example, a verb plus an adnominal ending). Examples include 'a movie worth watching with the family', 'a recently released movie', 'the nearest subway station', and 'the cheapest badminton racket'.

A single-word utterance is an utterance that ends in a noun rather than a predicate. A natural-language utterance is an utterance in sentence form that ends in a predicate (verb plus sentence-final ending).

The present invention thus relates to an utterance list generating apparatus that can effectively generate utterance lists by combining these various types of utterances, and to an IPTV broadcasting server and an IPTV broadcasting system to which the apparatus is applied. The technical features of the present invention are described in more detail below with reference to the drawings.

An apparatus for generating an utterance list for IPTV broadcasting according to an embodiment of the present invention is described below with reference to FIG. 1. FIG. 1 is a block diagram showing the apparatus for generating an utterance list for IPTV broadcasting according to an embodiment of the present invention.

As shown in FIG. 1, the apparatus 240 for generating an utterance list for IPTV broadcasting includes an utterance generation pattern database 111 and an utterance list generation unit 110.

The utterance generation pattern database 111 stores one or more utterance generation patterns used to generate the speech recognition utterance list for providing an IPTV broadcasting service through speech recognition. Each utterance generation pattern may be determined with the various kinds of utterances in mind, namely device operation commands, menu operation commands, content keywords, simple domain topic words, compound domain topic words, and constrained domain topic words.

When a new service is added to the IPTV broadcasting service, the utterance generation pattern database 111 is updated to include the utterance generation patterns of the new service, so that the new service can also be provided through speech recognition.

The utterance list generation unit 110 applies one or more utterances to the utterance generation patterns stored in the utterance generation pattern database 111 to generate a plurality of utterance lists corresponding to the user voice commands identified from the IPTV broadcast information data. That is, by including one or more utterances according to the utterance generation patterns, the utterance list generation unit 110 can generate utterance lists that correspond to the various voice commands of a user who wants to search IPTV broadcast data.

Here, the user's voice commands relate to device operation commands, menu operation commands, content keywords, simple domain topic words, compound domain topic words, constrained domain topic words, and the like.

The utterance generation patterns stored in the utterance generation pattern database 111 are described below with reference to Table 1. Table 1 gives the utterance generation pattern as defined in BNF (Backus-Naur Form) grammar.

As shown in Table 1, an utterance generation pattern consists of one or more Terms (가), and a Term may be a non-terminal term (NonTerminalTerm), a terminal term (TerminalTerm), or a variable (나).

A non-terminal term is the name of a file, enclosed between the start character "<" and the end character ">", in which another utterance generation pattern is described (다). A terminal term is an utterance string enclosed between the start character "'" and the end character "'" (라). A variable is a string made up of content keywords extracted from the IPTV broadcast information data (마). A repeater is a '+', '*', or '?' appended to a Term: '+' means the Term may be repeated one or more times, '*' means the Term may be omitted or repeated one or more times, and '?' means the Term may be omitted or used once (바).
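The Term rules above can be sketched in Python as a small pattern expander. This is only an illustrative sketch, not the patent's implementation: Table 1 is not reproduced here, so the concrete syntax (a `$` prefix for variables, one Term per list element), the in-memory `files` mapping that stands in for pattern files, and the `max_repeat` cap on '+' and '*' are all assumptions.

```python
import re
from itertools import product

# Assumed Term syntax: <file>, 'literal', or $variable, optionally followed by + * ?
TERM_RE = re.compile(r"^(<[^>]+>|'[^']*'|\$\w+)([+*?]?)$")


def expand_term(term, files, variables, max_repeat=2):
    """Return every string a single Term can produce."""
    match = TERM_RE.match(term)
    if match is None:
        raise ValueError(f"not a valid Term: {term!r}")
    body, repeater = match.groups()
    if body.startswith("<"):
        # Non-terminal: expand every pattern listed in the referenced file.
        base = [s for pattern in files[body[1:-1]]
                for s in expand_pattern(pattern, files, variables, max_repeat)]
    elif body.startswith("'"):
        # Terminal: a literal utterance string.
        base = [body[1:-1]]
    else:
        # Variable: content keywords extracted from broadcast information data.
        base = list(variables[body[1:]])
    low = 0 if repeater in ("*", "?") else 1
    # Hypothetical cap: '+' and '*' are unbounded in the grammar, so bound them here.
    high = 1 if repeater in ("", "?") else max_repeat
    results = []
    for count in range(low, high + 1):
        for combo in product(base, repeat=count):
            results.append(" ".join(combo))
    return results


def expand_pattern(pattern, files, variables, max_repeat=2):
    """Expand a pattern (a list of Terms) into all utterances it can generate."""
    parts = [expand_term(term, files, variables, max_repeat) for term in pattern]
    return [" ".join(p for p in combo if p) for combo in product(*parts)]


# Hypothetical data standing in for a genre file and extracted keywords.
files = {"genre": [["'action'"], ["'drama'"]]}
variables = {"title": ["VJ Network"], "episode": ["episode 12"]}
utterances = expand_pattern(["$title", "$episode?"], files, variables)
```

With the sample data, the optional episode Term yields one utterance with the episode and one without, which is the kind of variation the repeaters are meant to capture.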

An example of utterance generation patterns that can be used in practice is described below with reference to Table 2. Table 2 is an example of utterance generation patterns according to an embodiment of the present invention.

The utterance generation patterns in Table 2 are the contents of the file '영화발성목록.txt' (movie utterance list), and they refer to '배우명.txt' (actor names), '장르명.txt' (genre names), and content keywords extracted from the IPTV broadcast information data.

Using the utterance generation patterns written in '영화발성목록.txt', the utterance list generation unit 110 can generate an utterance list such as the one in Table 3.

In Table 3, 'VJ네트워크 12회' and 'VJ네트워크' are obtained by mapping the content keywords extracted from the IPTV broadcast information data "12회 VJ네트워크 - 아차산, 이천한우농장" onto '$본제목 $회차' ($main-title $episode) in row 4 and '$본제목' ($main-title) in row 5 of Table 2. That is, the utterance list generation unit 110 maps the main title 'VJ네트워크' and the episode '12회' onto (차) and (카) to generate the utterance list shown in Table 3.
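The mapping just described can be illustrated with a short Python sketch. It assumes the broadcast information string has the layout "episode, title, hyphen, details" seen in the example, and the two templates are hypothetical stand-ins for rows 4 and 5 of Table 2, which is not reproduced here.

```python
import re

# Assumed layout of the broadcast information string: "<episode> <title> - <details>".
EPISODE_TITLE = re.compile(r"^(?P<episode>\d+회)\s+(?P<title>.+?)\s+-\s+")

# Hypothetical stand-ins for '$본제목 $회차' (row 4) and '$본제목' (row 5).
TEMPLATES = ["{title} {episode}", "{title}"]


def utterances_for(info):
    """Fill each template with the title and episode found in one info string."""
    match = EPISODE_TITLE.match(info)
    if match is None:
        return []
    return [template.format(**match.groupdict()) for template in TEMPLATES]


result = utterances_for("12회 VJ네트워크 - 아차산, 이천한우농장")
```

For the example string this produces the two utterances named in the text, 'VJ네트워크 12회' and 'VJ네트워크'.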

In the manner illustrated by Tables 1 to 3, the utterance list generation unit 110 can generate utterance lists that include one or more utterances such as device operation commands, menu operation commands, simple domain topic words, compound domain topic words, or constrained domain topic words.

To handle voice commands that relate to content keywords, the apparatus 240 for generating an utterance list for IPTV broadcasting further includes a structured information extraction unit 120 and an unstructured information extraction unit 130, which extract extraction-type keywords and dictionary-type keywords, respectively, from the IPTV broadcast information data, and an utterance normalization unit 140, which processes the extracted keywords into natural utterances close to the user's pronunciation and provides them to the utterance list generation unit 110. Each unit is described below.

The structured information extraction unit 120 identifies a rule 121, such as a regular expression or a pattern rule, in the IPTV broadcast information data, for example when the data contains predetermined symbols or is in a markup language format such as HTML, and uses the identified rule to extract the main title or additional information from the data.

For example, when the IPTV broadcast information data for a piece of VOD content is "<괜찮아, 울지마> 예고편", the structured information extraction unit 120 identifies the rule 121 formed by the symbol characters '<' and '>' and extracts the main title '괜찮아 울지마' and the additional information '예고편' (trailer).
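A minimal Python sketch of the angle-bracket rule in this example follows. The regular expression, the function name `extract_structured`, and the choice to collapse commas and whitespace inside the title are assumptions made for illustration. The source only states that a rule based on '<' and '>' yields the title and the additional information.

```python
import re

# Assumed rule: "<main title> additional information".
ANGLE_TITLE = re.compile(r"^<(?P<title>[^<>]+)>\s*(?P<extra>.*)$")


def extract_structured(text):
    """Split an info string into its bracketed main title and trailing extra info."""
    match = ANGLE_TITLE.match(text.strip())
    if match is None:
        return None
    # Collapse commas and whitespace so the title reads as it would be spoken.
    title = re.sub(r"[,\s]+", " ", match.group("title")).strip()
    extra = match.group("extra").strip() or None
    return {"title": title, "extra": extra}


keywords = extract_structured("<괜찮아, 울지마> 예고편")
```

Strings that do not follow the bracket convention return `None`, so they can be handed to a different rule or to dictionary-based extraction.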

The unstructured information extraction unit 130 extracts proper nouns and technical terms contained in the IPTV broadcast information data. That is, it checks whether a string contained in the IPTV broadcast information data exists in a dictionary 131 of proper nouns and technical terms, and extracts the string if it does. IPTV broadcast information data such as "EBS 방귀대장뿡뿡이" will, however, contain personal names and character names that are not registered in a general dictionary. When the broadcast information data of a piece of content contains an unregistered proper noun, a procedure may therefore be needed in which the administrator of the utterance list generation unit 110, IPTV broadcasting service staff, the content creator, or the like registers that proper noun in the dictionary 131.
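The dictionary check can be sketched in a few lines of Python. Plain substring matching against a set of entries is an assumption here, since the source does not say how the dictionary 131 is searched, and the function name is hypothetical.

```python
def find_dictionary_keywords(text, dictionary):
    """Return dictionary entries that occur in text, ordered by first appearance."""
    hits = [word for word in dictionary if word in text]
    return sorted(hits, key=lambda word: (text.index(word), -len(word)))


# Hypothetical dictionary contents.
dictionary = {"이효리", "뿡뿡이"}
before = find_dictionary_keywords("EBS 방귀대장뿡뿡이", dictionary)

# Registering the missing character name makes the full name extractable.
dictionary.add("방귀대장뿡뿡이")
after = find_dictionary_keywords("EBS 방귀대장뿡뿡이", dictionary)
```

The second call shows why the registration procedure matters: the full character name is found only once it has been added to the dictionary.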

The utterance normalization unit 140 converts the main titles, additional information, proper nouns, technical terms, and the like extracted by the structured information extraction unit 120 and the unstructured information extraction unit 130 into forms close to the user's voice commands, according to preset rules.

For example, the utterance normalization unit 140 can convert 'MR' extracted from the IPTV broadcast information data "MR. 후아유" into '미스터' ("mister"), which is close to how the user would say it; convert 'VS' extracted from "김관장VS김관장VS김관장" into '대' (the spoken Korean form of "versus"); convert 'ST' extracted from "ST. ELMO'S FILE" into '세인트' ("saint"); and convert '9회말2아웃' extracted from "9회말2아웃" into '구회말투아웃', its spoken reading.
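These conversions can be expressed as an ordered table of rewrite rules. The Python sketch below is an assumption about how the preset rules might look. The source gives only the input and output pairs, and the ASCII-letter lookarounds are added here so that 'VS' is still matched when it sits directly between Korean characters.

```python
import re

# Hypothetical rule table: (pattern, spoken form). Only the pairs come from the text.
RULES = [
    (re.compile(r"(?<![A-Za-z])MR\.?(?![A-Za-z])"), "미스터"),
    (re.compile(r"(?<![A-Za-z])VS\.?(?![A-Za-z])"), "대"),
    (re.compile(r"(?<![A-Za-z])ST\.?(?![A-Za-z])"), "세인트"),
    (re.compile(r"9회말"), "구회말"),
    (re.compile(r"2아웃"), "투아웃"),
]


def normalize(text):
    """Apply each rewrite rule in order and return the spoken-form string."""
    for pattern, spoken in RULES:
        text = pattern.sub(spoken, text)
    return text


spoken_title = normalize("9회말2아웃")
```

Note that the digit readings differ by context ('9' becomes the Sino-Korean '구' while '2' becomes the loanword '투'), which is why the sketch uses whole-phrase rules rather than a per-digit table.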

An IPTV broadcasting system to which the apparatus 240 for generating an utterance list for IPTV broadcasting is applied is described below with reference to FIG. 2. FIG. 2 is a block diagram showing an IPTV broadcasting system according to an embodiment of the present invention.

As shown in FIG. 2, the IPTV broadcasting system 20 according to an embodiment of the present invention includes an IPTV broadcasting server 200 and an IPTV set-top device 300.

The IPTV broadcasting server 200 generates and provides an utterance list including one or more utterances corresponding to the user's voice commands, so that an IPTV broadcasting service can be provided according to the recognized speech. It includes a broadcast data database 211, an information data database 212, a management unit 250, an utterance list generation unit 240, a command interpretation unit 230, a search unit 220, an application execution unit 260, and a transmission unit 270. Each component of the IPTV broadcasting server 200 is described below.

The broadcast data database 211 stores the IPTV broadcast data (for example, content and VOD files) for the IPTV broadcasting service, and the information data database 212 stores the IPTV broadcast information data (for example, content titles).

The management unit 250 stores and manages the IPTV broadcast data and broadcast information data, and updates the broadcast data database 211 and the information data database 212 when a new service or new content is added to the IPTV broadcast. The management unit 250 may manage the IPTV broadcast data and broadcast information data in separate fields such as storage date, time, content type, source, and producer information.

The utterance list generation unit 240 generates an utterance list that includes one or more utterances, so that an IPTV broadcasting service can be provided according to the recognized speech, and stores the generated utterance list in an utterance list database 241. Broadcast data is continually updated, whereas the storage space of the utterance list database 241 is limited. The utterance list generation unit 240 (or the management unit 250) therefore stores each utterance list in association with the broadcast information data, which makes it possible to perform management tasks such as deleting utterance lists in the utterance list database 241 that are no longer used.
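One way to picture this association is an index from each utterance to the identifiers of the content it was generated from. The Python sketch below assumes such an index. The data structure, the identifiers, and the function name are hypothetical, since the source states only that utterance lists are stored in association with the broadcast information data.

```python
def prune_utterances(utterance_index, live_content_ids):
    """Keep only utterances that still point at content present in the info data."""
    kept = {}
    for utterance, content_ids in utterance_index.items():
        still_live = content_ids & live_content_ids
        if still_live:
            kept[utterance] = still_live
    return kept


# Hypothetical index: utterance -> ids of the content entries it was generated from.
index = {
    "VJ네트워크 12회": {"c1"},
    "VJ네트워크": {"c1", "c2"},
}
pruned = prune_utterances(index, live_content_ids={"c2"})
```

After content `c1` is removed, the episode-specific utterance disappears while the title-only utterance survives because another entry still uses it.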

The utterance list generation unit 240 can generate utterance lists by the method of Tables 1 to 3 described above, and transmits the generated utterance lists to the IPTV set-top device 300 in real time or periodically so that the IPTV set-top device 300 becomes aware of new IPTV broadcasts.

The command interpretation unit 230 interprets the recognized user voice command using one or more utterance lists. That is, the command interpretation unit 230 receives from the IPTV set-top device 300 a string corresponding to the user's voice command and interprets it. If the interpreted command requests a search of IPTV broadcast data, it is passed to the search unit 220; if it requests execution of an IPTV broadcast application, it is passed to the application execution unit 260. The applications may relate to Internet search, home shopping search, purchasing, weather lookup, stock lookup, securities lookup, and the like.

The search unit 220 receives from the command interpretation unit 230 a command requesting a search of IPTV broadcast data, and searches the broadcast data database 211 for the IPTV broadcast data corresponding to the interpreted voice command.

The application execution unit 260 receives from the command interpretation unit 230 a command requesting execution of an IPTV broadcast application, and executes the IPTV broadcast application corresponding to the interpreted command. Specifically, the application execution unit 260 executes any of the applications stored in an application database 261 and passes the execution result to the transmission unit 270.

For example, if the received string is "the cheapest badminton racket", the application execution unit 260 can turn it into an SQL statement, sort the badminton racket products in the shopping data stored in the application database 261, find the lowest-priced badminton racket among them, and provide it to the IPTV set-top device 300.

Alternatively, if the received string is "today's weather in the Daejeon area", it can use an SQL statement to query the weather information stored in the application database 261, supplying today's date for the date field and the Daejeon area for the region field, and provide the retrieved weather information to the IPTV set-top device 300.
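The two queries can be sketched with Python's built-in `sqlite3` module. The table names, column names, and sample rows are hypothetical, because the source does not describe the schema of the application database 261. The step that turns the recognized string into query parameters is also left out.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical shopping and weather tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (name TEXT, category TEXT, price INTEGER);
        CREATE TABLE weather (date TEXT, region TEXT, forecast TEXT);
    """)
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [("Racket A", "badminton racket", 35000),
         ("Racket B", "badminton racket", 18000),
         ("Desk", "furniture", 90000)],
    )
    conn.executemany(
        "INSERT INTO weather VALUES (?, ?, ?)",
        [("2009-09-18", "Daejeon", "sunny"), ("2009-09-18", "Seoul", "cloudy")],
    )
    return conn


def cheapest(conn, category):
    """Sort the products of one category by price and return the cheapest."""
    return conn.execute(
        "SELECT name, price FROM products WHERE category = ? "
        "ORDER BY price ASC LIMIT 1",
        (category,),
    ).fetchone()


def weather_for(conn, date, region):
    """Look up the forecast by filling the date and region fields."""
    row = conn.execute(
        "SELECT forecast FROM weather WHERE date = ? AND region = ?",
        (date, region),
    ).fetchone()
    return row[0] if row else None


conn = build_demo_db()
best = cheapest(conn, "badminton racket")
forecast = weather_for(conn, "2009-09-18", "Daejeon")
```

Parameter placeholders (`?`) are used instead of string formatting so that recognized text never ends up inside the SQL itself.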

The transmission unit 270 transmits to the IPTV set-top device 300 the IPTV broadcast data retrieved by the search unit 220 or the result (for example, a UI menu) of application execution by the application execution unit 260.

Meanwhile, the search unit 220 and the application execution unit 260 of the IPTV broadcasting server 200 need not be separate components; they may of course be merged into a single processing unit (not shown) that performs data search or application execution. The IPTV set-top device 300 is described below.

The IPTV set-top device 300 recognizes a user voice command and interprets and processes the recognized voice command using the one or more voice lists. It includes a voice list database 310, a voice recognition unit 320, a command interpreter 330, a control unit 340, and a display unit 350.

The voice list database 310 receives one or more voice lists from the IPTV broadcasting server 200 and stores them. The voice list database 310 receives voice lists for new IPTV broadcasts or their services from the IPTV broadcasting server 200 in real time or periodically.

The voice recognition unit 320 recognizes the user's voice command and converts it into a character string using the one or more voice lists.
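One way to picture the role of the voice list in this step is as a closed vocabulary that the recognizer's output is constrained to. The sketch below is a text-level stand-in only: a real recognizer works on audio, and the description does not state how matching is done. The approximate string matching via the standard-library `difflib.get_close_matches`, the sample `VOICE_LIST`, and the `cutoff` value are assumptions.

```python
from difflib import get_close_matches
from typing import Optional, Sequence

# Hypothetical voice list entries as they might be stored in the voice list database 310.
VOICE_LIST = ["turn on", "power off", "volume up", "today's weather in daejeon"]


def to_voice_list_entry(
    hypothesis: str,
    voice_list: Sequence[str] = VOICE_LIST,
    cutoff: float = 0.6,
) -> Optional[str]:
    """Map a raw recognition hypothesis to the closest voice list entry, if any."""
    matches = get_close_matches(hypothesis.strip().lower(), voice_list, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A hypothesis that is not close enough to any entry yields `None`, which a caller could treat as "not recognized".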

The command interpreter 330 checks whether the character string received from the voice recognition unit 320 is a device operation command and, if it is not, requests the IPTV broadcasting service corresponding to the character string. The command interpreter 230 of the IPTV broadcasting server 200 then receives the character string, interprets whether it is a command requesting IPTV broadcast data or a command requesting execution of an IPTV broadcast application, and passes the interpreted command to the search unit 220 or the application execution unit 260 according to the interpretation result.
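The split between local handling and a request to the server can be sketched as below. The set `DEVICE_COMMANDS` and the callables `control_unit` and `request_service` are illustrative stand-ins for the control unit 340 and the request sent to the IPTV broadcasting server 200; none of these names come from the description.

```python
# Hypothetical set of device operation commands known to the set-top device.
DEVICE_COMMANDS = {"turn on", "power off", "volume up"}


def handle_recognized_text(text, control_unit, request_service):
    """Handle device operation commands locally; forward everything else to the server."""
    normalized = text.strip().lower()
    if normalized in DEVICE_COMMANDS:
        control_unit(normalized)   # stays on the set-top device
        return "local"
    request_service(text)          # sent on for search or application execution
    return "server"
```

For instance, "Volume up" would be handled on the device, while "cheapest badminton racket" would be forwarded to the server.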

When the control unit 340 receives a device operation command from the command interpreter 330, it controls the operation of the IPTV set-top device 300 accordingly. As described above, a device operation command is a command that operates the IPTV set-top device, for example 'Turn On', 'Power off', 'Volume up', or 'Volume down 4 steps'.
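Commands that carry a quantity, such as a volume change by a number of steps, need a small amount of parsing before the control unit can act on them. The following sketch assumes English phrasings of the form "volume up" or "volume down 4 steps" and a volume range of 0 to 20; the pattern, the range, and the names `parse_volume` and `apply_volume` are all assumptions.

```python
import re
from typing import NamedTuple, Optional


class DeviceAction(NamedTuple):
    target: str
    delta: int  # positive raises the level, negative lowers it


# Accepts "volume up", "volume down 4", and "volume down 4 steps".
VOLUME_PATTERN = re.compile(r"^volume (up|down)(?: (\d+)(?: steps?)?)?$")


def parse_volume(text: str) -> Optional[DeviceAction]:
    """Turn a volume command into a signed step count; None if it is not a volume command."""
    match = VOLUME_PATTERN.match(text.strip().lower())
    if match is None:
        return None
    steps = int(match.group(2) or 1)  # default to one step when no number is spoken
    return DeviceAction("volume", steps if match.group(1) == "up" else -steps)


def apply_volume(level: int, action: DeviceAction, lowest: int = 0, highest: int = 20) -> int:
    """Apply the action and clamp the result to the assumed volume range."""
    return max(lowest, min(highest, level + action.delta))
```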

The display unit 350 displays the IPTV broadcasting service that is retrieved at the request of the command interpreter 330 or provided through execution of an application. In this manner, the present invention can provide an IPTV broadcasting service based on voice recognition.

The configuration of the present invention has been described in detail above with reference to the accompanying drawings, but this description is merely illustrative, and a person of ordinary skill in the art to which the invention pertains can of course make various modifications and changes within the scope of the technical idea of the invention. Accordingly, the scope of protection of the present invention should not be limited to the embodiments described above, but should be determined by the following claims.

FIG. 1 is a block diagram showing an apparatus for generating a voice list for IPTV broadcasting according to an embodiment of the present invention.

FIG. 2 is a block diagram showing an IPTV broadcasting system according to an embodiment of the present invention.

Claims

An apparatus for generating a voice list for IPTV broadcasting, comprising:

a pattern database that stores one or more utterance generation patterns for providing an Internet Protocol Television (IPTV) broadcasting service through voice recognition; and

a voice list generator that uses the utterance generation patterns to generate a voice list of utterances for voice recognition corresponding to a user's voice command.

The apparatus of claim 1,

wherein the voice list includes one or more utterances comprising a content keyword, a device operation command, a menu operation command, single-area control, complex-area control, or restricted-area control, and

wherein the content keyword consists of a character string included in IPTV broadcast information data.

The apparatus of claim 1, further comprising:

a structured information extraction unit that extracts extraction-type keywords from IPTV broadcast information data;

an unstructured information extraction unit that extracts dictionary-type keywords from the IPTV broadcast information data; and

an utterance normalization unit that processes the extracted keywords into natural utterances close to the user's pronunciation and provides them to the voice list generator.

The apparatus of claim 1,

wherein, when a new service is added to the IPTV broadcasting service, the pattern database is updated to further include an utterance generation pattern for the new service.

An IPTV broadcasting system, comprising:

an IPTV broadcasting server that generates and provides a voice list including one or more utterances corresponding to a user's voice command, in order to provide an Internet Protocol Television (IPTV) broadcasting service according to the user's voice command; and

an IPTV set-top device that recognizes the user's voice command and interprets and processes the recognized voice command using the voice list.

The system of claim 5, wherein the IPTV set-top device comprises:

a voice list database that receives and stores the voice list;

a voice recognition unit that recognizes the user's voice command and converts the voice command into a character string using the voice list;

a command interpreter that checks whether the character string is a device operation command and, when it is not, requests the IPTV broadcasting service corresponding to the character string from the IPTV broadcasting server;

a control unit that controls operation of the IPTV set-top device according to the device operation command; and

a display unit that displays the IPTV broadcasting service provided in response to the request.

The system of claim 5, wherein the IPTV broadcasting server comprises:

a broadcast data database that stores IPTV broadcast data;

a voice list generator that generates the voice list including the one or more utterances in order to provide the IPTV broadcasting service according to the recognized voice;

a command interpreter that interprets the recognized voice command using the voice list;

a search unit that searches the broadcast data database for IPTV broadcast data corresponding to the interpreted voice command;

an execution unit that executes an IPTV broadcast application corresponding to the interpreted voice command; and

a transmission unit that transmits the retrieved IPTV broadcast data or the result of the application execution to the IPTV set-top device.

An IPTV broadcasting server, comprising:

a first database that stores IPTV broadcast data;

a voice list generator that generates a voice list corresponding to a user command using the IPTV broadcast data;

a second database that stores the generated voice list;

a command interpreter that interprets recognized voice using the voice list;

a command processor that searches for IPTV broadcast data corresponding to the interpreted voice or executes an IPTV broadcast application corresponding to the interpreted voice; and

a transmission unit that transmits the retrieved IPTV broadcast data or the execution result of the application to an indoor device.

The server of claim 8,

wherein the voice list generator generates the voice list using an utterance generation pattern that includes at least one of unstructured data and structured data for IPTV broadcasting.

The server of claim 8,

wherein the second database is stored in association with the first database.