KR100872191B1

KR100872191B1 - System and method for enhancing live speech with information accessed from the world wide web

Info

Publication number: KR100872191B1
Application number: KR1020057003143A
Authority: KR
Inventors: Fernando Incertis Carro (카로 페르난도 인세르티스)
Original assignee: International Business Machines Corporation (인터내셔널 비지네스 머신즈 코포레이션)
Priority date: 2002-09-27
Filing date: 2003-07-23
Publication date: 2008-12-09
Also published as: JP4378284B2; US20050251394A1; AU2003267006A1; US7865367B2; JP2006500665A; TW200414021A; US7505907B2; US20090228275A1; WO2004029831A2; CN100530175C; WO2004029831A3; TWI259970B; CN1685339A; KR20050047104A

Abstract

The present invention relates to a system, method, and computer program that enable an audience listening to a live speech to access complementary information related to terms pronounced during the speech, either immediately afterward or at a later time. The system associates hyperlinks (i.e., URLs) with selected terms or words that are likely to be pronounced by the presenter during the speech. A speech recognition system operating on a presenter device (i.e., a computer system with a microphone connected) recognizes the presenter's pronunciation of any of the hyperlinked terms during the speech (i.e., word spotting) and records the time at which each recognized hyperlinked term is pronounced. The system also synchronizes the presenter device and multiple audience devices (e.g., workstations, portable computers, personal digital assistants, smart phones, or any other type of hand-held computing device) according to the same universal time, so that the flow of information transmitted by the presenter and received by the audience is always synchronized, regardless of the relative positions of the presenter and the audience. Whenever an audience member notices a topic of interest during the speech, he or she selects the topic simply by pressing a key provided on the audience device. The universal time at which the audience member selects the topic is stored in the audience device.

Description

SYSTEM AND METHOD FOR ENHANCING LIVE SPEECH WITH INFORMATION ACCESSED FROM THE WORLD WIDE WEB

The present invention relates generally to techniques for integrating, in a live event, means for accessing complementary information, and more particularly to systems and methods for enhancing a live event, such as a live speech or a live broadcast program, with information, in particular with information that can be accessed on the World Wide Web.

In our society, the need for information and services is growing. Most people attending a live event (e.g., the audience of a conference given by a presenter) or listening to a live radio or television broadcast program want access to complementary information. Such complementary information may consist of biographies of the players in a football match, historical background on events mentioned in a news program, or athletic records during the broadcast of an Olympic competition.

In fact, people today look for more information about what they are hearing, whether locally, as spectators of live events, or remotely, as the audience of live broadcast programs:

· Consumers want access to special services associated with advertised products.

· Media providers expect new sources of revenue from expanding the quantity and quality of the services and information provided to consumers and, more specifically, to the audiences of live television or radio programs.

· Advertisers are looking for new and more effective forms of advertising.

Online Services on the Web

Independently of the massive development of radio and television, online services such as those offered on the World Wide Web (i.e., the Web) have rapidly emerged in our society and are now widely available. Such online services, based on Internet technology, provide access to a huge amount of information on an interactive basis. The Internet is a global network of computers. It connects computers based on a variety of different operating systems or languages using a protocol suite referred to as TCP/IP ("Transmission Control Protocol/Internet Protocol"). With the increasing size and complexity of the Internet, tools have been developed to help users find information on the network. These tools are often called "navigators" or "navigation systems". The World Wide Web ("WWW" or "the Web") is a recent, superior navigation system. The Web is:

· an Internet-based navigation system,

· an information distribution and management system for the Internet, and

· a dynamic format for communicating on the Web.

The Internet and the Web are transforming our society by enabling millions of users to access and exchange information and to communicate with one another. By integrating images, text, audio, and video, a user on the Web using a graphical user interface can today easily communicate with different computers on the system, different system applications, and different information formats for files and documents including, for example, text, sound, and graphics. Currently, online systems on the Web offer users a variety of services, for instance private message services, electronic commerce, news, real-time games, access to electronic databases, electronic newsletters, business-to-business transactions, or job placement services.

However, even though such online services are now widely available, searching for and finding relevant information on the Web remains an arduous task that can take hours, even for experienced users. Since the Web is essentially an open, multi-point to multi-point network, each user can select and retrieve different information from many different servers. In fact, most online interactions with the Web today occur merely through textual input, for instance by entering URL (Uniform Resource Locator) names, by entering keywords in search tools, or by activating textual hyperlinks in HTML (Hypertext Markup Language) documents. Even if, in the near future, the development of audiovisual interfaces (e.g., human speech interfaces, Web-phone integration) makes textual input less and less important in online environments, there is a good chance that the Web will remain hard to use because of its massiveness, its lack of organization, and its randomness. Simply stated, in the Web there is no order or direction. Information remains hard to find most of the time and, in the foreseeable future, finding the required information in the desired context will become an even more difficult task.

Online Services from Live Speech

Unlike the multi-point to multi-point Web network, a live speech given to an audience (whether the audience is at the same location as the presenter or remotely located, i.e., accessing the speech through a radio or television broadcast station) is primarily a communication from a single emitter to multiple receivers. Every member of the audience receives the same content, whether locally from the presenter or remotely through the broadcast station.

Thus, to provide online services similar to those that can be accessed on the Web, a first problem with a live speech is that the information flows continuously in the same direction, from a single source to multiple receivers, from a provider to multiple listeners. The communication flow is limited to one direction, without any exchange of information with the audience. People cannot directly interact with the oral information they receive to access additional information or services.

Moreover, when listening to a live speech, a problem for the audience is to select topics of interest and then to identify the network addresses (i.e., URLs) for accessing (e.g., from the Web) the multimedia information or services related to the selected topics. Until today, this problem has been solved only partially.

One solution for providing web-like capabilities to oral or radio information is to embed information (e.g., URLs) in the transmitted broadcast audio signal or to carry it on a separate channel (simulcast). Examples of such systems are described in the following patents:

· U.S. Pat. No. 6,125,172, entitled "Apparatus and method for initiating a transaction having acoustic data receiver that filters human voice",

· U.S. Pat. No. 6,098,106, entitled "Method for controlling a computer with an audio signal",

· U.S. Pat. No. 5,841,978, entitled "Network linking method using steganographically embedded data objects",

· U.S. Pat. No. 5,832,223, entitled "System, method and device for automatic capture of Internet access information in a broadcast signal for use by an Internet access device",

· U.S. Pat. No. 5,761,606, entitled "Media online services access via address embedded in video or audio program",

· U.S. Pat. No. 5,189,630, entitled "Method for encoding and broadcasting information about live event using computer pattern matching techniques",

· U.S. Pat. No. 5,119,507, entitled "Receiver apparatus and methods for identifying broadcast audio program selections in a radio broadcast system", and

· U.S. Pat. No. 6,061,719, entitled "Synchronized presentation of television programming and web content".

The systems and methods described in these patents require the transmission of complementary information (e.g., URLs) encoded, embedded, or modulated on the same audio or video signal, or transmitted on a separate channel, concurrently with the transmission of the main program. Radio or television stations must include means for encoding, modulating, and transmitting this complementary information along with the audio signal. Radio listeners or television viewers must be equipped with special receivers and decoder circuits for recovering this information.

Independently of the arrangements discussed above, systems have been developed that enable the audience to "pre-select" topics of interest (i.e., keywords or sentences) and to associate these topics with pre-specified network addresses (i.e., URLs). These pre-specified network addresses are used to access multimedia information or services related to the pre-selected topics. In general terms, all these systems are based on speech recognition techniques. These techniques are used to identify keywords (i.e., selected words or sentences) in order to perform specific actions in response to the recognition of particular sounds. Examples of such systems can be found in the following patents.

U.S. Pat. No. 5,946,050, entitled "Keyword listening device", discloses a method and a system for monitoring the audio portion of a broadcast signal by means of a keyword listening device in which a relatively limited set of keywords is stored. The keyword listening device monitors the broadcast signal for any of these keywords. Upon recognition of any one or more of the keywords, the broadcast audio signal is recorded for a period of time and then fully analyzed. After analysis, and depending on the recorded and analyzed broadcast audio signal, a number of different functions may be performed, such as connecting to an external network at a specified address or controlling a video cassette recorder.

U.S. Pat. No. 6,011,854, entitled "Automatic recognition of audio information in a broadcast program", discloses an audio processing system for searching for information reports or updates (such as traffic, weather, time, sports, or news) broadcast over one or more radio stations. The search is based on at least one keyword (such as "traffic", "weather", "time", "sports", or "news", depending on the desired report) that is pre-selected by the user and entered into the audio processing system. While the speech recognition software used by the audio processing system scans the radio stations for the requested information report, the user may listen to other audio sources (a CD, a tape, another radio station, etc.) without having to monitor (that is, listen to) the scanned broadcasts personally. Once the requested information report is detected in a radio broadcast, based on the entered keyword, the audio processing system switches its audio output to the radio station transmitting the desired broadcast, so that the user can conveniently listen to the traffic, weather, time, sports, and/or news reports or updates in a timely manner.

U.S. Pat. No. 6,332,120, entitled "Broadcast speech recognition system for keyword monitoring", discloses a system in which broadcast audio is automatically monitored for information of interest. The system comprises a computer processor with a memory for storing a vocabulary of keywords of interest, an audio receiver for receiving an audio broadcast, and a speech recognition system, associated with the audio receiver and the computer processor, for detecting when one of the keywords appears in a received audio segment. A report generator, associated with the computer processor and responsive to the detection of a keyword, generates a report concerning the detected keyword and details related to its context.

Even though the systems described above do not require complementary information to be embedded in the audio signal (or transmitted in a secondary signal concurrently with the retransmission of the main program), the listeners must be equipped with receivers having speech recognition capabilities in order to detect the occurrence of hyperlinked terms in the data stream.

In the field of speech processing, the ability to identify occurrences of words or sentences in a stream of voice data is commonly called "word spotting". The goal of audio word spotting is to identify the boundaries of a search term within a digitized continuous speech stream without requiring prior manual editing. Searching and indexing a live speech that may be uttered by any speaker is particularly problematic. This is due in large part to the limited capabilities of existing automatic speech recognition technology. It is important to note that, in the systems discussed above, the word spotting task is performed on the audience side, in a speaker-independent manner, using speech models trained on voice data other than the data to be recognized.

In fact, a fundamental problem with all these systems is the unreliable behavior of state-of-the-art speech recognition technology when performing "word spotting" (i.e., identifying pre-specified keywords or terms) in a continuous, speaker-independent manner, on the basis of unknown or generic speaking styles, vocabularies, noise levels, and language models.

As the foregoing discussion shows, even though interactive systems have been developed in recent years to increase and improve the level of interaction with users and to provide more information and more learning or entertainment opportunities (e.g., interactive television, WebTV), important sources of information, such as those that can be found on the Web, still remain inaccessible to the audience of a live speech (e.g., a live conference, or a live interview received from a radio or television broadcast).

Therefore, there is today a need for a convenient, universal, and easy mechanism that enables people attending a live speech (or, alternatively, receiving a live broadcast program) to select and access complementary information.

There is also a need for presenters and producers of live broadcast programs to create hyperlinks from selected terms (generally selected spoken utterances, words, or sentences) intended to be pronounced during a speech (e.g., during a conference or a live radio or television program) to relevant data on the Web, without embedding these hyperlinks in conventional one-way broadcast signals and, more generally, without physically transmitting these hyperlinks and without modifying conventional transmitters or receivers.

A broad object of the present invention is to enhance audio information, such as a live speech or a live radio or television broadcast program, with complementary information or services related to that audio information.

Another object of the invention is to create hyperlinks between selected terms or words intended to be pronounced by a presenter during a speech and complementary information related to these selected terms.

A further object of the invention is to identify hyperlinked terms as they are pronounced by the presenter during the speech and to activate the hyperlinks associated with the identified terms.

Another object of the invention is to enable an audience member, during a live speech, to select one or more terms related to topics that draw his or her attention and, immediately after the speech or at a later time, to access information related to the previously selected topics.

A further object of the invention is to enable an audience member of a live speech to access information related to topics that drew his or her attention during the speech, with minimal effort and involvement on his or her part, while minimizing the complexity of the required equipment.

As defined in the independent claims, the present invention is directed to a system, a method, and a computer program for enabling an audience member of a live speech to access, immediately after the speech or at a later time, information related to terms pronounced during the speech.

The system associates hyperlinks (i.e., URLs) with selected terms or words that are likely to be pronounced by the presenter during the speech. A speech recognition system operating on a presenter device (i.e., a computer system with a microphone connected) recognizes the presenter's pronunciation of any of the hyperlinked terms during the speech (i.e., word spotting) and records the time at which each recognized hyperlinked term is pronounced.

The system is also based on synchronizing the presenter device with several audience devices (e.g., workstations, portable computers, personal digital assistants (PDAs), smart phones, or any other type of hand-held computing device) according to the same universal time, so that the flow of information transmitted by the presenter and received by the audience is always synchronized, regardless of the relative positions of the presenter and the audience. Each time an audience member notices a topic of interest during the speech, he or she immediately selects the topic simply by pressing a key provided on the audience device. The universal time at which the audience member selects the topic is stored in the audience device.

In a preferred embodiment of the invention, the synchronization between the presenter device and the audience devices is achieved by reference to a universal time such as GPS time (Global Positioning System time), GLONASS (Global Orbiting Navigational Satellite System) time, or another suitable universal time based on a satellite system. A GPS or GLONASS receiver is integrated in or connected to the presenter device. A GPS or GLONASS receiver is likewise integrated in or connected to each audience device. Each audience device is independent of and separate from the radio or television set that the audience member uses to receive the broadcast speech.
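The selection mechanism on the audience device can be sketched as follows. This is only an illustration under stated assumptions: the class name `SelectionRecorder`, its method names, and the injectable `universal_clock` callable are hypothetical, and the default `time.time` merely stands in for a GPS or GLONASS time source, which this sketch does not implement.

```python
import time
from typing import Callable, List


class SelectionRecorder:
    """Hypothetical audience-device component that timestamps key presses."""

    def __init__(self, universal_clock: Callable[[], float] = time.time) -> None:
        # universal_clock is assumed to return the shared universal time in
        # seconds; a real device would read it from a GPS/GLONASS receiver.
        self._clock = universal_clock
        self.selection_times: List[float] = []

    def on_key_press(self) -> float:
        """Record the universal time at which the listener selected a topic."""
        selected_at = self._clock()
        self.selection_times.append(selected_at)
        return selected_at


if __name__ == "__main__":
    recorder = SelectionRecorder()
    recorder.on_key_press()
    print(recorder.selection_times)
```

Because the device stores only a time stamp, it needs neither a speech recognizer nor any connection to the radio or television set; the stored times are matched against the presenter's records later.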

More particularly, the present invention discloses a system, a method, and a computer program for generating, from a presenter device, a Speech Hyperlink-Time Table during a presentation of one or more foils, the Speech Hyperlink-Time Table being accessible by one or more audience members. The method comprises the steps of:

· locating and accessing a Speech Hyperlink Table comprising means for identifying one or more predefined hyperlinked terms intended to be pronounced by the presenter, and means for locating and accessing information related to each of the one or more predefined hyperlinked terms;

· retrieving means for locating and accessing a Speech Hyperlink-Time Table;

during the speech:

· recognizing, when a hyperlinked term is pronounced by the presenter, the hyperlinked term predefined in the Speech Hyperlink Table, by means of a speech recognition system connected to the presenter device;

for each recognized hyperlinked term:

· determining the universal time corresponding to the pronunciation of the recognized hyperlinked term by the presenter;

· creating a new record in the Speech Hyperlink-Time Table; and

· copying into the new record the universal time corresponding to the pronunciation of the recognized hyperlinked term by the presenter, the means for identifying the recognized hyperlinked term retrieved from the Speech Hyperlink Table, and the means for locating and accessing information related to the recognized hyperlinked term retrieved from the Speech Hyperlink Table.
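The presenter-side steps above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the record layout, the function name `build_speech_hyperlink_time_table`, the use of a plain dictionary as the Speech Hyperlink Table, and the `example.com` URLs are all assumptions made for the example, and the speech recognizer is replaced by an iterable of already-recognized terms.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class HyperlinkTimeRecord:
    """One row of the (hypothetical) Speech Hyperlink-Time Table."""
    universal_time: float  # when the presenter pronounced the term
    term: str              # identifies the hyperlinked term
    url: str               # locates the related information


def build_speech_hyperlink_time_table(
    recognized_terms: Iterable[str],
    speech_hyperlink_table: Dict[str, str],
    universal_clock: Callable[[], float],
) -> List[HyperlinkTimeRecord]:
    """Create one record per recognized, predefined hyperlinked term."""
    table: List[HyperlinkTimeRecord] = []
    for term in recognized_terms:
        url = speech_hyperlink_table.get(term)
        if url is None:
            continue  # not a predefined hyperlinked term: ignore it
        # Copy the universal time, the term, and its URL into a new record.
        table.append(HyperlinkTimeRecord(universal_clock(), term, url))
    return table


if __name__ == "__main__":
    import time

    links = {"GPS": "https://example.com/gps"}  # placeholder URL
    for row in build_speech_hyperlink_time_table(["GPS"], links, time.time):
        print(row)
```

In a live setting `recognized_terms` would be a generator fed by the word-spotting engine, so that each record is stamped at the moment the term is recognized.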

The present invention also discloses a system, method, and computer program for selecting, from an audience device, one or more hyperlink terms pronounced by a presenter during a presentation, and for accessing, after a period of time has elapsed, the information related to each selected hyperlink term. The method comprises the steps of:

· each time a selection command is received during the presentation for selecting the term being pronounced by the presenter at that moment, determining the current universal time and recording it in a selection hyperlink-time table;

· accessing a presentation hyperlink-time table that is accessible by one or more audience devices, the presentation hyperlink-time table comprising, for each of a plurality of predefined hyperlink terms pronounced by the presenter, the universal-time interval corresponding to the pronunciation of the predefined hyperlink term, the means for identifying the predefined hyperlink term, and the means for locating and accessing the information related to the predefined hyperlink term;

for each universal time recorded in the selection hyperlink-time table:

· identifying, in the presentation hyperlink-time table, the selected hyperlink term pronounced at the recorded universal time;

· retrieving, from the presentation hyperlink-time table, the means for identifying the selected hyperlink term and the means for locating and accessing the information related to it;

· storing, in the selection hyperlink-time table, the retrieved means for identifying the selected hyperlink term and for locating and accessing the information related to it.
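The audience-side lookup described in the steps above can be illustrated with a short sketch. It assumes, hypothetically, that each row of the presentation hyperlink-time table is a tuple `(start, end, term, url)` with half-open, non-overlapping universal-time intervals sorted by start time; the function names are illustrative and not taken from the specification.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

Row = Tuple[float, float, str, str]  # (start, end, term, url), hypothetical layout


def resolve_selection(selection_time: float,
                      presentation_table: List[Row]) -> Optional[Tuple[str, str]]:
    """Return (term, url) for the interval containing selection_time, if any."""
    starts = [row[0] for row in presentation_table]
    # Index of the last interval whose start is <= selection_time.
    i = bisect_right(starts, selection_time) - 1
    if i < 0:
        return None
    start, end, term, url = presentation_table[i]
    # Intervals are treated as half-open: start <= t < end.
    if selection_time < end:
        return (term, url)
    return None


def update_selection_table(
    selection_times: List[float],
    presentation_table: List[Row],
) -> Dict[float, Optional[Tuple[str, str]]]:
    """Fill in term and URL for every universal time the audience recorded."""
    return {t: resolve_selection(t, presentation_table) for t in selection_times}
```

A selection time that falls outside every interval resolves to `None` here; how such selections should be handled (for example, by snapping to the nearest term) is a design choice the sketch leaves open.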

Further embodiments of the invention are set forth in the appended dependent claims.

The foregoing and other objects, features, and advantages of the invention will be better understood by reference to the following specification, claims, and drawings.

The novel and inventive features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further objects, and advantages thereof, will best be understood by reference to the following detailed description of an illustrative detailed embodiment when read in conjunction with the accompanying drawings.

FIG. 1 shows how a member of the audience at a live presentation, such as a conference, notices topics of interest.

FIG. 2 shows how a listener or viewer notices topics of interest while following a live radio or television program.

FIG. 3 shows how, according to the present invention, the presenter workstation and the audience devices are synchronized to the same universal time, and how a speech recognition system running on the presenter device recognizes hyperlink terms pronounced by the presenter.

FIG. 4 shows how, according to the present invention, the presenter prepares the text of the presentation in advance.

FIG. 5 shows how, according to the present invention, the presenter selects words or terms in the presentation text for which hyperlinks (i.e., associated URLs) to multimedia information or services are to be created.

FIG. 6 shows how, according to the present invention, the presenter creates a presentation hyperlink table that associates with each hyperlink term an address for accessing multimedia information or services.

FIG. 7 shows how, according to the present invention, the presenter trains the speech recognition system before the presentation.

FIG. 8 shows how, according to the present invention, the speech recognition system running on the presenter workstation recognizes (i.e., word-spots) hyperlink terms as the presenter pronounces them during the presentation.

FIG. 9 shows how, according to the present invention, when the speech recognition system recognizes (word-spots) a hyperlink term, the recognized term, its associated address, and the universal time of recognition are stored in a presentation hyperlink-time table located on a presentation server connected to a communication network (e.g., the Internet).

FIG. 10 shows an example of a presentation hyperlink-time table stored on a presentation server according to the present invention.

FIG. 11 shows how, according to the present invention, a member of the audience who hears a term corresponding to a topic of interest during the presentation selects that term simply by pressing a key provided on a portable computer.

FIG. 12 shows how, according to the present invention, the universal times at which the audience member selected terms of interest during the presentation are stored in a selection hyperlink-time table located on the audience device.

FIG. 13 shows how, according to the present invention, by connecting the audience device to the communication network, the audience member updates the selection hyperlink-time table located on his or her workstation using the information contained in the presentation hyperlink-time table stored on the presentation server.

FIG. 14 shows how, according to the present invention, the hyperlink terms are identified and copied, together with their associated Uniform Resource Locators (URLs), from the presentation hyperlink-time table stored on the presentation server to the selection hyperlink-time table on the audience device.

FIG. 15 shows how, according to the present invention, the audience member selects a hyperlink term (corresponding to a topic of interest selected during the presentation) from the updated selection hyperlink-time table and activates the associated hyperlink.

FIG. 16 shows how, according to the present invention, the multimedia information or service hyperlinked to the selected term is accessed through the communication network and retrieved on the audience device.

FIG. 17 shows the steps of creating the presentation hyperlink table and training the speech recognition system to identify hyperlink terms according to the present invention.

FIG. 18 shows the steps of creating the presentation hyperlink-time table on the presentation server and recognizing hyperlink terms pronounced during the presentation according to the present invention.

FIG. 19 shows the steps of creating the selection hyperlink-time table on the audience device and selecting topics of interest during the presentation according to the present invention.

FIG. 20 shows the steps of retrieving the Uniform Resource Locators (URLs) and accessing the information or services related to the selected hyperlink terms.

As shown in FIGS. 1 and 2, the present invention discloses a system and method that enable a spectator (100) attending a live event (102, 202) (e.g., a conference), or a radio listener or television viewer (200) following a live broadcast program, to select one or more topics (101, 201) that draw his or her attention or interest and to access, immediately or at a later time, multimedia information related to the selected topics (103, 203).

FIG. 1 shows a typical situation according to the present invention. A member of the audience (100) attending a live presentation (e.g., a conference on the subject "Wine and Health") becomes interested in a topic (101) (e.g., the term "Resveratrol") and would like further information about it. In such a situation, there is a need for a simple mechanism that lets the audience member (100) select topics during the presentation (e.g., terms such as "Tannins", "Phenols", and "Resveratrol") and then access, immediately or at a later time, information related to the selected topics, for example on a server connected to the Internet.

FIG. 2 shows another typical situation. A television viewer (200) watches a live broadcast presentation (e.g., a live television program on the subject "Wine and Health").

As shown in FIG. 3, hyperlinks (i.e., URLs) (303) are associated with terms or words (304) that the presenter (301) is likely to mention during the presentation. These hyperlink terms or words relate to relevant topics or items (302). A speech recognition system (305) running on the presenter workstation (306) (i.e., a computer system with a microphone (307) attached) detects when any of the hyperlink terms (308) is pronounced by the presenter during the presentation and records the time (309) at which the detected hyperlink term was pronounced. The system also relies on synchronizing (312) the presenter device (306) and a number of audience devices (311) (e.g., workstations, portable computers, personal digital assistants (PDAs), smartphones, or any other type of handheld computing device) to the same universal time (309, 310), so that the flow of information transmitted by the presenter (301) and received by the audience (300) is always synchronized, regardless of the relative positions of the presenter (301) and the audience (300). Whenever an audience member (300) notices a topic of interest (313) during the presentation, he or she selects it immediately, simply by pressing a key (314) provided on the audience device (311). The universal time (310) at which the audience member (300) selected (314) the topic is stored on the audience device (311).

As shown in FIG. 3, the present invention is based on the following principles:

1. Synchronization of the presenter workstation (306) and the audience devices (311) to the same universal time (312) (e.g., GPS time as provided by GPS receivers (309, 310)), regardless of the relative positions of the presenter (301) and the audience (300).

2. Detection of the hyperlink terms (304) (e.g., "Resveratrol" (308)) pronounced by the presenter (301) during the presentation (302), by means of a speech recognition system (305) running on the presenter workstation (306) (to which a microphone is connected).

Universal Timing Systems

Common timing sequences that are independent of the locations of the presenter and the audience can be derived from an absolute timing reference such as Global Positioning System (GPS) time or Universal Time Co-ordinated (UTC) time (also known today as GMT and ZULU time).

To transmit precise timing signals, GPS uses 24 satellites in 55° inclined orbits 10,000 miles above the Earth. Any GPS receiver anywhere on Earth can use these timing signals to determine its position. A 1575 MHz transmission carries a phase-modulated signal with a 1 MHz bandwidth, called the clear acquisition (C/A) code. When a GPS receiver receives this signal from at least three GPS satellites, it can determine its own latitude and longitude with an accuracy of approximately 30 meters. Apart from determining geographic position, GPS is today widely used to distribute Precise Time and Time Interval (PTTI). The system uses time of arrival (TOA) measurements to determine position. Because time is obtained in addition to position by measuring the TOA of four satellites that are in view simultaneously, the user does not need a precisely set clock. If the altitude above sea level is known, three satellites suffice. If the user is stationary at a known position, time can in principle be obtained by observing a single satellite. Information about GPS time services is available at http://tycho.usno.navy.mil/, operated by the Time Service Department of the U.S. Naval Observatory in Washington, DC.

Today GPS is the world's principal supplier of accurate time. It is widely used both as a source of time and as a means of transferring time from one location to another. Three kinds of time are available from GPS: GPS time; UTC as estimated and produced by the United States Naval Observatory; and the times from each free-running GPS satellite's atomic clock. The Master Control Station (MCS) at Falcon Air Force Base near Colorado Springs, Colorado, gathers the GPS satellites' data from five monitor stations around the world. A Kalman filter software program estimates the time error, frequency error, frequency drift, and Keplerian orbit parameters for each of the satellites and its operating clock. This information is uploaded to each satellite so that it can be broadcast in real time. This process keeps GPS time consistent across the constellation to within a few nanoseconds and allows the positions of the satellites to be determined to within a few meters.

A second universal time standard, Universal Time Co-ordinated (UTC), introduces leap seconds to remain synchronized with the rotation of the Earth. To provide an estimate of UTC time derivable from a GPS signal, a set of UTC corrections is also broadcast as part of the GPS signal. This broadcast message includes the difference in whole seconds between GPS time and UTC. This complicates software that handles the smooth flow of data streams or computes the time between data samples. GPS time is preferred in the present invention because it avoids the introduction of leap seconds and is easily related to UTC. Information about UTC (GMT) time services can be found at http://time.greenwich2000.com/.
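The relationship between the two time scales can be illustrated with a minimal sketch. It assumes timestamps are expressed as seconds elapsed since the GPS epoch (6 January 1980, 00:00:00 UTC) and that the whole-second GPS-minus-UTC offset is supplied by the caller, for example as decoded from the broadcast message; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Start of the GPS time scale, expressed in UTC.
GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)


def gps_to_utc(gps_seconds: float, gps_minus_utc: int) -> datetime:
    """Convert seconds since the GPS epoch to a UTC datetime.

    gps_minus_utc is the whole-second offset between GPS time and UTC
    (an input here; the sketch does not hard-code any leap-second table).
    """
    return GPS_EPOCH + timedelta(seconds=gps_seconds - gps_minus_utc)


def elapsed_seconds(gps_a: float, gps_b: float) -> float:
    """Interval between two GPS timestamps.

    Because GPS time has no leap seconds, a plain subtraction is enough.
    """
    return gps_b - gps_a
```

The second function shows why a leap-second-free scale is convenient for comparing the presenter's and the audience's timestamps: intervals are plain differences, and the offset is needed only when a time is displayed as UTC.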

GPS Receivers

A Direct-to-Digital GPS receiver is described at the following web site:

http://w3.research.ibm.com/present/gto200038.htm

This is an example of a small, low-cost chip that can integrate GPS into any device (e.g., PDAs, mobile phones, wearable computers, video cameras). The receiver was developed jointly by IBM and Leica. The high-speed analog capabilities of SiGe technology, when integrated with CMOS technology, allow this single chip to be integrated directly into a digital GPS (Global Positioning System) receiver. Position information derived from GPS is finding a wide variety of applications, from mapping and surveying to vehicle tracking, locating 911 callers, automated farm vehicles, and even robotic golf carts. This receiver chip reduces the size and complexity of the radio. It has no analog mixer stages and none of the expensive discrete components (such as high-quality filters) that conventional two-stage analog down-conversion would require. Instead, the incoming GPS signal is digitized directly at the antenna and then filtered digitally in a CMOS-based chip. This direct digitization is made possible by the ability of SiGe technology to run at high speed on very little power; the core of the technology is a SiGe-based analog-to-digital data converter.

According to the present invention, GPS or GLONASS receivers must be integrated into or connected to the presenter workstation (typically a personal computer) and the audience devices (e.g., personal computers, wearable computers, personal digital assistants (PDAs), smartphones, or onboard mobile computers). The universal timing signals received from GPS or GLONASS satellites are used to initialize and synchronize the internal electronic clocking systems of the presenter workstation and the audience devices to the same universal time. During periods in which the GPS or GLONASS satellites are out of sight (e.g., when the presenter's or audience's devices are inside a building or not connected to an external antenna), and therefore no timing signals are received from those satellites, timing information must continue to be derived from the autonomous electronic clocking systems of the devices. Depending on the drift of the clocking systems installed in the devices, satellite signals must be reacquired more or less frequently in order to maintain sufficient timing accuracy and to ensure that the user devices remain synchronized to the same universal time as the presenter workstation and the broadcasting station. In practice:

· If the user device is a portable or in-vehicle device, satellite signals are received whenever the user is outdoors or traveling.

· If the user device is fixed, or remains installed inside a home or building for long periods, it must be connected to a GPS or GLONASS antenna installed outdoors (e.g., an antenna on the roof of the building).
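How often a device must reacquire the satellite signal depends on how fast its local clock drifts and how much timing error the application tolerates. The following back-of-the-envelope sketch makes that dependence explicit. It assumes a simple worst-case linear drift model, and the function name and the example figures (20 ppm drift, 0.5 s tolerance) are hypothetical illustrations rather than values from the specification.

```python
def max_holdover_seconds(tolerance_s: float, drift_ppm: float) -> float:
    """Longest time a free-running clock can go without resynchronization.

    tolerance_s: largest acceptable timing error, in seconds.
    drift_ppm:   worst-case clock drift, in parts per million
                 (1 ppm means 1 microsecond of error per second elapsed).
    """
    if tolerance_s <= 0 or drift_ppm <= 0:
        raise ValueError("tolerance_s and drift_ppm must be positive")
    # Accumulated error after t seconds is t * drift_ppm / 1e6; solve for t.
    return tolerance_s * 1_000_000 / drift_ppm


# Hypothetical example: a 20 ppm oscillator and a half-second tolerance.
holdover = max_holdover_seconds(0.5, 20.0)
print(f"resynchronize at least every {holdover / 3600:.1f} hours")
```

Under these assumed figures the device could run for several hours between satellite fixes, which is consistent with the observation that a portable device only needs to see the sky occasionally.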

Systems and Methods for Spotting Words in Continuous Speech

Speech recognition is the process by which a computer (or another type of system or apparatus) identifies spoken words. Basically, a speech recognition system is a computer that can correctly recognize what a speaker is saying.

Speech recognition is an extremely difficult task. Unlike written text, spoken words are not separated by clear gaps: complete phrases and sentences are typically pronounced without pause. Moreover, the acoustic variability of the speech signal typically precludes an unambiguous mapping to a sequence of words or of subword units such as the pronunciations of consonants and vowels. A major source of variability in speech is coarticulation, the tendency of the acoustic characteristics of a given speech sound, or phone, to differ depending on the phonetic context in which it is produced.

Speech recognition systems can be classified according to the speaking styles, vocabularies, and language models they accommodate. Isolated word recognizers require the speaker to insert brief pauses between individual words. Continuous speech recognizers operate on fluent speech, but typically employ strict language models, or grammars, to limit the number of allowable word sequences.

Word spotters are a particular kind of speech recognizer. They too operate on fluent speech. However, rather than providing a full transcription, word spotters selectively locate relevant words or phrases. Word spotting is useful for retrieving information based on keyword indexing and for recognizing isolated words in voice command applications.
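The selective behavior of a word spotter, as opposed to full transcription, can be illustrated with a toy sketch. Note the strong simplifying assumption: the sketch operates on an already tokenized stream of `(universal_time, word)` pairs rather than on acoustic data, so it shows only the matching logic, not an actual recognizer. The function name `spot_terms` is hypothetical.

```python
from typing import Iterable, List, Tuple

Token = Tuple[float, str]  # (universal time in seconds, spoken word)


def spot_terms(tokens: List[Token],
               hyperlink_terms: Iterable[str]) -> List[Tuple[float, str]]:
    """Report each occurrence of a predefined term and the time it starts.

    Matching is case-insensitive and ignores trailing punctuation.
    Words that belong to no hyperlink term are silently skipped.
    """
    # Pre-split every term so that multi-word phrases can be matched.
    phrases = [(term, term.lower().split()) for term in hyperlink_terms]
    words = [w.lower().strip('.,;:!?"') for _, w in tokens]
    hits = []
    for i in range(len(words)):
        for term, parts in phrases:
            if parts and words[i:i + len(parts)] == parts:
                # Timestamp of the first word of the matched phrase.
                hits.append((tokens[i][0], term))
    return hits
```

Each hit pairs a term with the time at which it was spoken, which is the information the presenter side needs in order to log a recognized hyperlink term.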

오늘날, 다수의 음성 인식 시스템은 본 발명을 지원하기 위해 요구되는 단어 추출 기능을 제공할 수 있다. 이러한 시스템은 발표 진행 동안 또는 발표에서 발표자에 의해 발음될 수 있는 사전 정의된 단어 또는 문장(하이퍼링크 용어)의 검출을 가능하게 한다. 이러한 음성 인식 시스템은, 예를 들면, 다음의 특허에 개시되어 있다.Today, many speech recognition systems can provide the word extraction functionality required to support the present invention. Such a system enables the detection of predefined words or sentences (hyperlink terms) that can be pronounced by the presenter during or during the presentation. Such a speech recognition system is disclosed in the following patent, for example.

"Wordspotting for voice editing and indexing"라는 명칭의 미국 특허 제 5,199,077 호는, HMM(hidden Markov models)을 기반으로 하는 단어 추출 기법에 관해 개시하였다. 이 기법은 발표자가 동적으로 키워드를 지정하게 하고, 키워드의 단일 반복에 의해 연관된 HMM을 트레이닝시킬 수 있게 한다. 비 키워드 발표(non-keyword speech)는 연속적인 발표의 사전 기록된 샘플로 트레이닝되어 있는 HMM을 사용하도록 모델링되어 있다. 이러한 단어 추출기는 음성 메일(voice mail) 또는 혼합형 매체 문서(mixed-media documents)를 편집하는 등의 대화형 애플리케이션 및 단일 발표자 음향 또는 영상 기록에서의 키워드 인덱싱용으로 의도된 것이다.U.S. Patent No. 5,199,077 entitled "Wordspotting for voice editing and indexing" discloses a word extraction technique based on hidden Markov models (HMM). This technique allows the presenter to dynamically specify keywords and train the associated HMM by a single iteration of the keywords. Non-keyword speech is modeled to use HMMs that have been trained with pre-recorded samples of successive speeches. Such word extractors are intended for interactive applications, such as editing voice mail or mixed-media documents, and for indexing keywords in a single presenter sound or video record.

"Method for word spotting in continuous speech"라는 명칭의 미국 특허 제 5,425,129 호는, 원하는 목록으로부터 단어 또는 문장의 존재에 대한 디지털화된 발표 데이터 채널을 분석하는 시스템 및 방법에 관해 개시하였다. IBM 연속 음성 인식 시스템(IBM Continuous Speech Recognition System : ICSRS)과 관련하여 구현된 본 발명에 따른 시스템 및 방법은, 무관한 표음 데이터가 존재할 때 사전 지정된 단어 또는 문장을 추출하는 서브시스템(subsystem)을 음성 인식 시스템 내에 제공한다.US Patent No. 5,425,129, entitled "Method for word spotting in continuous speech," discloses a system and method for analyzing a digitized presentation data channel for the presence of words or sentences from a desired list. The system and method according to the invention, implemented in connection with the IBM Continuous Speech Recognition System (ICSRS), comprises a subsystem for extracting a pre-specified word or sentence when there is irrelevant phonetic data. Provided within the recognition system.

"Word spotting using both filler and phone recognition"라는 명칭의 미국 특허 제 5,950,159 호는 음향 데이터 내에서 키워드를 확인하는 단어 추출 시스템 및 방법에 관해 개시한다. 이 방법은 필러 인식 단계(filler recognition phase) 및 키워드 인식 단계(keyword recognition phase)를 포함하는데, 필러 인식 단계 동안에 음향 데이터는 단음을 식별하고, 단음에 대한 일시적 구획 문자(temporal delimiters) 및 가능도 스코어(likelihood scores)를 생성하도록 처리되며, 키워드 인식 단계 동안에, 음향 데이터는 단음의 시퀀스를 포함하는 지정된 키워드의 사례를 식별하도록 처리되고, 여기에서 필러 인식 단계에서 생성된 일시적 구획 문자 및 가능도 스코어는 키워드 인식 단계에서 사용된다.U.S. Patent 5,950,159, entitled "Word spotting using both filler and phone recognition", discloses a word extraction system and method for identifying keywords in acoustic data. The method includes a filler recognition phase and a keyword recognition phase, during which the acoustic data identifies a single tone, temporal delimiters and likelihood scores for the single note. to generate likelihood scores, and during the keyword recognition step, the acoustic data is processed to identify instances of the specified keyword comprising a sequence of monotones, where the temporal delimiter and likelihood scores generated in the filler recognition step are Used in the keyword recognition phase.

"System and device for advanced voice recognition word spotting"라는 명칭의 미국 특허 제 6,006,185 호는 발표자에 무관한, 연속 발표, 단어 추출 음성 인식 시스템 및 방법에 관해 개시하였다. 발음에서 음소(phonemes)의 경계는 빠르고 정확하게 구분되어야 한다. 발음은 음소의 경계를 기초로 파장 세그먼트(wave segments)로 분리된다. 음성 인식 엔진(voice recognition engine)은 여러 개의 파장 세그먼트에 대해 여러 번 검색하고, 해당 결과를 분석하여 정확하게 발음되는 단어를 식별한다.US Pat. No. 6,006,185, entitled "System and device for advanced voice recognition word spotting," discloses a speaker-independent, continuous presentation, word extraction speech recognition system and method. In pronunciation, the boundaries of phonemes must be quickly and accurately distinguished. Pronunciation is divided into wave segments based on phoneme boundaries. A voice recognition engine searches multiple wavelength segments several times and analyzes the results to identify words that are pronounced correctly.

"Fast vocabulary independent method and apparatus for spotting words in speech"라는 명칭의 미국 특허 제 6,073,095 호는, 발표문 내에서 단어/단음 시퀀스를 추출하기 위해 사전 처리 단계와 개략-상세 검색 전략(coarse-to-detailed search strategy)을 이용함으로써 발표문 내에서 단어를 추출하는 신속한 어휘 비의존적 방법에 관해 개시하였다.U.S. Patent No. 6,073,095 entitled "Fast vocabulary independent method and apparatus for spotting words in speech" describes a pre-processing step and a coarse-to-detailed search strategy for extracting word / single sequences in a presentation. strategy), a rapid vocabulary-independent method of extracting words from a presentation is disclosed.

"System and method for automatic audio content analysis for word spotting, indexing, classification and retrieval"라는 명칭의 미국 특허 제 6,185,527 호는, 후속적인 정보 검색을 위해 음향 스트림을 인덱싱하는 시스템 및 방법에 관해 개시하였다.US Patent No. 6,185,527, entitled "System and method for automatic audio content analysis for word spotting, indexing, classification and retrieval", discloses a system and method for indexing an acoustic stream for subsequent information retrieval.

U.S. Patent No. 6,230,126, entitled "Word-spotting speech recognition device and system", discloses a speech recognition device that includes a dictionary storing the features of recognition objects. The device further includes a matching unit that compares the features of input speech with the features of the recognition objects, and a dictionary updating unit that updates the time length of the phonemes in the dictionary based on the input speech when the matching unit finds substantial similarity between the input speech and one of the recognition objects.

A fundamental problem of speech recognition systems is that state-of-the-art technology is unreliable at performing "word spotting" (that is, identifying pre-specified keywords or terms) in a continuous, speaker-independent manner over unknown or generic speaking styles, vocabularies, noise levels, and language models. In the present invention, by contrast, the presenter can train and tune the system in advance using his or her own voice, particular speaking style, and a specifically adapted vocabulary and language model, so the automatic word-spotting function that the invention requires is relatively easy to implement.

According to one aspect of the invention, a system and method are disclosed for creating hyperlinks from preselected utterances (304) (typically spoken words, terms, or sentences) that a presenter (301) will pronounce during a presentation (302) to related data (303) on the web, and for training a speech recognition system (305) to identify (word spot) the hyperlink terms (304).

According to another aspect of the invention, a system and method are disclosed for automatically recognizing (word spotting) the hyperlink terms (304) as they are pronounced by the presenter (301) during a live presentation, and for creating on a network server a table that contains the list of recognized hyperlink terms (304), the associated network addresses (i.e., URLs) (303), and the universal times (309) at which those hyperlink terms were recognized.

According to yet another aspect of the invention, a system and method are disclosed for recording on an audience device (311) the universal times (310) at which an audience member (300) selects (314) one or more topics of interest (313) during the course of the presentation (302).

According to another aspect of the invention, a system and method are disclosed for enabling the audience member (300) to access and retrieve, from servers connected to a computer network, the information related to the hyperlink terms (304).

Method for creating a presentation hyperlink table and training the speech recognition system with the vocabulary of hyperlink terms

Referring to FIG. 17, the invention discloses a system, method, and computer program for use by a presenter (301) before a presentation, to create hyperlinks from selected utterances (304) (typically spoken words or sentences) that the presenter (301) will pronounce during the presentation (302) to related data (303) on one or more servers connected to a computer network, preferably the web, and to train a speech recognition system (305) to word spot the presentation using the vocabulary of those hyperlink terms (304). The method comprises,

during the creation or editing of the presentation:

· editing (1701) the text or draft text (400) of the presentation;

· selecting and marking (1702), in the presentation text (500), a plurality of relevant terms or words (501) for which hyperlinks must be created;

· creating (1703) a presentation hyperlink table (600) for the presentation; and

· defining, in the presentation hyperlink table (600), hyperlinks between the selected terms or words (501) and multimedia information or services located on one or more servers (909) connected to a computer network (908), wherein the defining comprises, for each of the selected hyperlink terms or words (501), assigning (1704) a name and/or a description (601) (preferably a brief description), assigning (1705) a destination address (602) (for example, a URL) within the network (908) for accessing the desired multimedia information or service, and storing (1706) the assigned name (or description) (601) and/or destination address (602) in the presentation hyperlink table (600);

and, once the hyperlinks have been defined in the presentation hyperlink table (600):

· training (1707) the speech recognition system (701) running on the presenter workstation (702) with the vocabulary of hyperlink terms (602, 703), so that the hyperlink terms are automatically recognized when they are pronounced by the presenter (700) during the presentation.
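
As a rough illustration of steps 1703 to 1707, the presentation hyperlink table can be pictured as a mapping from each selected term to its name and destination address. The Python sketch below is only one possible reading: the names `HyperlinkEntry`, `build_hyperlink_table`, and `recognizer_vocabulary` are hypothetical and do not come from the patent, and the speech recognizer itself is not modeled (only the vocabulary list it would be trained on). The two URLs are the ones quoted in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperlinkEntry:
    """One row of the presentation hyperlink table (600)."""
    name: str  # name or short description (601)
    url: str   # destination address (602)


def build_hyperlink_table(links):
    """Store a name and a URL for each selected term (steps 1704-1706).

    `links` maps each selected term to a (name, url) pair. Terms are
    lower-cased so that later lookups do not depend on capitalization
    (an assumption of this sketch).
    """
    return {term.lower(): HyperlinkEntry(name, url)
            for term, (name, url) in links.items()}


def recognizer_vocabulary(table):
    """Return the terms the recognizer would be trained on (step 1707)."""
    return sorted(table)


table = build_hyperlink_table({
    "Resveratrol": ("Resveratrol",
                    "http://www.ag.uiuc.edu/~ffh/resvera.html"),
    "Ellagic acid": ("Ellagic acid",
                     "http://www.hopeforcancer.com/ellagicacid.htm"),
})
print(recognizer_vocabulary(table))
```

Keying the table by the spoken term keeps the later lookup (recognized term to name and URL) a single dictionary access.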

FIG. 4 shows the text (400) of an exemplary presentation (for example, a conference titled "Wine and Health") given in a conference room or broadcast on radio or television. During the live presentation, the presenter (301) may read this text in full, or may deliberately omit some parts, change the order, or introduce alternative or additional commentary as convenient.

FIG. 5 shows how the presenter, a program editor (or any person entrusted with the editing) takes the presentation text (500) and selects certain terms (501) (words or sentences such as "phenol", "resveratrol", "tannin", "ellagic acid", "hydroxycinnamates", "free radicals", "anthocyanin", and "gallic acid") to associate with additional information accessible on the web. Each of these selected terms, called hyperlink terms, must be associated with a network address on the web (i.e., the corresponding URL) from which the related information or service can be retrieved.

FIG. 6 shows how the presenter, a program editor (or any person entrusted with the editing) creates, on the presenter workstation (306), a presentation hyperlink table (600) that associates each selected hyperlink term (601) (i.e., a word or sentence selected in the presentation text, for example "resveratrol") with the corresponding URL (602) on the web (for example, http://www.ag.uiuc.edu/~ffh/resvera.html).

FIG. 7 shows how, before the presentation, the presenter (700) trains a speech recognition system (701) with word-spotting capability, installed on his or her workstation (702), with the vocabulary of hyperlink terms (703) (for example, "phenol", "resveratrol", "tannin"). In a particular embodiment of the invention, the word-spotting function is implemented using the IBM Continuous Speech Recognition System (ICSRS) running on the IBM ViaVoice software product.

Method for recognizing hyperlink terms during a presentation and creating a presentation hyperlink time table on a presentation server

As shown in FIG. 18, the invention also discloses a system, method, and computer program for use in a presenter workstation (802), to recognize hyperlink terms (803) as they are pronounced by the presenter (800) during a live presentation, to create a presentation hyperlink time table (906) on a presentation server (907) connected to a network (908), and to update this table with records comprising the sequence (905) of recognized hyperlink terms (304), the corresponding network addresses (i.e., URLs) (303), and the universal times (309) at which those hyperlink terms were recognized. More specifically, the method comprises:

· creating (1801) a presentation hyperlink time table (906) for the presentation on a presentation server (907) connected to the network (908);

and, during the presentation:

· recognizing (1802) the hyperlink terms (803) pronounced by the presenter (800) by "word spotting" the presentation with a speech recognition system (801) that runs on the presenter workstation (802) and has been purposely trained for the presentation;

and, for each recognized hyperlink term (803):

· determining (1803), by means of a universal time device, the universal time (904) at which the hyperlink term (903) was recognized;

· creating (1804) a record comprising the universal time (904) at which the hyperlink term (903) was recognized, the name or short description (601) of the recognized hyperlink term (803) taken from the presentation hyperlink table (600), and the network address (1005) (i.e., the URL) corresponding to the recognized hyperlink term (803), also taken from the presentation hyperlink table (600); and

· storing (1805) the record in the presentation hyperlink time table (906, 1000) held on the presentation server (907), which is accessible from the presenter workstation (902) through the network (908).
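
Steps 1803 to 1805 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the patented implementation: `make_record` and `store_record` are hypothetical names, the system clock in UTC stands in for the universal time device, and an in-memory list stands in for the table held on the presentation server. The term, URL, and timestamp are the example values given in the description.

```python
from datetime import datetime, timezone

# Presentation hyperlink table (600): term -> (name, URL).
HYPERLINK_TABLE = {
    "resveratrol": ("Resveratrol",
                    "http://www.ag.uiuc.edu/~ffh/resvera.html"),
}


def make_record(term, when=None):
    """Build one time-table record for a recognized term (steps 1803-1804)."""
    if when is None:
        # Stand-in for the universal time device: the system clock in UTC.
        when = datetime.now(timezone.utc)
    name, url = HYPERLINK_TABLE[term]
    return {"universal_time": when, "name": name, "url": url}


def store_record(time_table, record):
    """Append the record to the time table (step 1805).

    A real system would send the record to the presentation server;
    here the table is a plain in-memory list.
    """
    time_table.append(record)


time_table = []
# The description gives 12/05/2001 14:23:18; reading it as day/month/year
# is an assumption of this sketch.
spoken_at = datetime.strptime("12/05/2001 14:23:18", "%d/%m/%Y %H:%M:%S")
spoken_at = spoken_at.replace(tzinfo=timezone.utc)
store_record(time_table, make_record("resveratrol", spoken_at))
print(time_table[0])
```

Because records are appended in the order the terms are recognized, the table stays sorted by universal time, which simplifies later lookups.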

FIG. 8 shows how, while the presenter (800) delivers the presentation, the "word spotting" function, which runs on the speech recognition system (801) installed on the presenter workstation (802) (implemented, for example, using the IBM Continuous Speech Recognition System (ICSRS) running on the IBM ViaVoice software product) and has been trained in advance by the presenter, automatically detects the hyperlink terms (803) in the voice stream.
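
The detection step can be pictured with a text-only stand-in. The Python sketch below assumes that a recognizer has already turned the voice stream into a sequence of words and simply looks for vocabulary terms in that sequence; it does not perform acoustic word spotting and is not the ICSRS or ViaVoice API. The function name `spot_terms` and the sample sentence are hypothetical.

```python
def spot_terms(words, vocabulary):
    """Return (word_index, term) for each vocabulary term in the word stream.

    Multi-word terms are supported; at each position the longest
    matching term wins.
    """
    vocab = {tuple(term.lower().split()) for term in vocabulary}
    max_len = max((len(term) for term in vocab), default=0)
    # Normalize case and strip surrounding punctuation from each word.
    words = [w.lower().strip('.,;:!?"') for w in words]
    hits = []
    i = 0
    while i < len(words):
        for n in range(max_len, 0, -1):
            candidate = tuple(words[i:i + n])
            if len(candidate) == n and candidate in vocab:
                hits.append((i, " ".join(candidate)))
                i += n  # skip past the matched term
                break
        else:
            i += 1  # no term starts at this position
    return hits


sentence = "Red wine contains resveratrol and ellagic acid."
print(spot_terms(sentence.split(), ["resveratrol", "ellagic acid", "tannin"]))
# prints [(3, 'resveratrol'), (5, 'ellagic acid')]
```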

FIG. 9 shows how, after a hyperlink term (903) (for example, "resveratrol") has been recognized by the speech recognition system (901) running on the presenter workstation (902), the following are sent (905) to, and stored in, the presentation hyperlink time table (906) located on the presentation server (907):

· the universal time (904) at which the hyperlink term (903) was recognized (for example, 12/05/2001 14:23:18);

· the name or short description (601) of the hyperlink term (for example, "resveratrol"); and

· the associated URL (602) (for example, http://www.ag.uiuc.edu/~ffh/resvera.html), the last two items being extracted from the presentation hyperlink table (600).

FIG. 10 shows an example of the presentation hyperlink time table (1000) (906) stored on the presentation server (907), as it appears at the end of a presentation (for example, a presentation on the subject "Wine and Health" (1001)). The header of the table contains the URL (1002) (http://www.directbuyer.com/conference-0173.htm/), that is, the network address at which the presentation hyperlink time table (1000) is stored on the presentation server (907) (for example, www.directbuyer.com). The URL at which the presentation hyperlink time table (1000) can be found in the network must be made known to the audience in advance. Each row of the table corresponds to a hyperlink term (903) pronounced by the presenter during the presentation and recognized by the speech recognition system (901). The columns hold, respectively:

· the universal time (1003) at which the speech recognition system (901) recognized the hyperlink term (903) pronounced during the presentation;

· the name and/or short description (601, 1004) of the recognized hyperlink term (903), copied from the presentation hyperlink table (600); and

· the URL (602, 1005) corresponding to the recognized hyperlink term (903), copied from the presentation hyperlink table (600).

Method for creating a selection hyperlink time table and selecting topics of interest

As shown in FIG. 19, the invention also discloses a system, method, and computer program for creating a selection hyperlink time table (1106) on an audience device (1102) and for recording in this table the sequence of universal times (1105) at which an audience member (1100) selects (1104) topics of interest (1103) during the presentation, so that additional information related to those topics can be received immediately after the presentation or at a later time. More specifically, the method, for use in the audience device, comprises:

· creating (1901) a selection hyperlink time table (1200) on the audience device (1102);

· recording (1902), in the selection hyperlink time table (1200), the network address (i.e., the URL) (1201) of the presentation hyperlink time table (906) stored on the presentation server (907);

· listening (1903) to the presentation (1107) given by the presenter (1101);

· noticing (1904), during the presentation, a topic of interest (1103) for which additional information or service is desired;

· selecting (1905) this topic of interest (1103) by entering a selection command (1104) on the audience device (1102);

· determining (1906) the current universal time (1105) by means of a universal time device (for example, a GPS receiver) integrated in or connected to the audience device; and

· recording (1907) this current universal time (1105) in the selection hyperlink time table (1106, 1202).
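
The audience-side steps 1901, 1902, and 1905 to 1907 amount to keeping a timestamped list under a header URL. The Python sketch below is illustrative only: the class name `SelectionTable` is hypothetical, the system clock in UTC stands in for the GPS receiver, and the selection time used in the example (14:24:00) is invented for demonstration. The header URL is the one quoted in the description.

```python
from datetime import datetime, timezone


class SelectionTable:
    """Selection hyperlink time table (1200) kept on the audience device."""

    def __init__(self, presentation_table_url):
        # Header: URL (1201) of the presentation hyperlink time table.
        self.presentation_table_url = presentation_table_url
        self.rows = []

    def select(self, when=None):
        """Record the universal time of one key press (steps 1905-1907)."""
        if when is None:
            # Stand-in for the universal time device (e.g. a GPS receiver).
            when = datetime.now(timezone.utc)
        # Name and URL stay empty until the table is updated from the server.
        self.rows.append({"universal_time": when, "name": None, "url": None})
        return when


selections = SelectionTable("http://www.directbuyer.com/conference-0173.htm/")
selections.select(datetime(2001, 5, 12, 14, 24, 0, tzinfo=timezone.utc))
print(len(selections.rows))
```

Only the time is stored at selection time, which is what allows the device to work in stand-alone mode during the presentation.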

FIG. 11 shows how, during the presentation, an audience member (1100) operates a portable audience device (1102) in stand-alone mode (that is, isolated and not connected to a network). According to the particular embodiment detailed in FIG. 11, each time the audience member (1100) notices a topic of interest (1103) during the presentation, he or she immediately selects that topic simply by pressing a key (1104) provided on the audience device (1102). The universal time (1105) corresponding to the selection is stored in the selection hyperlink time table (1106) on the audience device (1102).

FIG. 12 shows a typical selection hyperlink time table (1200) created on the audience device (1102) during the presentation. Each row of the table corresponds to a different universal time at which the audience member selected a topic of interest during the presentation. The header of the table contains the URL (1201) (for example, http://www.directbuyer.com/conference-0173.htm/) of the presentation hyperlink time table (906, 1000) for the presentation. The presentation hyperlink time table (906, 1000) on the presentation server (907) (for example, www.directbuyer.com) is updated from the presenter workstation (902) during the presentation. As noted above, the URL of the presentation server must be provided to the audience in advance so that the selection hyperlink time table (1200) can be created locally.

Method for retrieving hyperlinks and accessing information

As shown in FIG. 20, the invention also discloses a system, method, and computer program that enable an audience member (300) to access and retrieve from the web the data or information related to the hyperlink terms (304) that were active when the audience member made the corresponding selections (314). More specifically, the method, for use on the audience device (1300), comprises:

· accessing (2001) the presentation server (1303);

and, for each universal time (1202) recorded in the selection hyperlink time table (1304, 1200) located on the audience device (1300):

· sending (2002) the recorded universal time (1306) to the presentation server (1303);

· identifying (2003), in the presentation hyperlink time table (1302, 1000) located on the presentation server (1303), the hyperlink term (1004) associated with that universal time (1306) (that is, the hyperlink term that was active at that universal time);

· retrieving (2004), from the presentation hyperlink time table (1302, 1000) located on the presentation server (1303), the name (or description) (1004) and the destination address (URL) (1005, 1307) of the selected hyperlink term;

· storing (2005) the retrieved hyperlink name and destination address (URL) (1307) in the selection hyperlink time table (1304, 1402) located on the audience device (1300);

· selecting (2006) a hyperlink (1501) in the selection hyperlink time table (1502), using the hyperlink name (or description) or the associated destination address retrieved from the presentation server (1507);

· activating (2007) the hyperlink (1501) using a browser program running on the audience device (1500);

· accessing (2008) the information and/or service located on a server (1506) connected to the network (1505), using the retrieved destination address (1503, 1504) associated with the selected hyperlink (1501);

· retrieving (2009) the information and/or service (1604, 1601) from the accessed server (1603) on the network (1602); and

· displaying the retrieved information and/or service (1604) on the audience device (1600) using the browser program.

FIG. 13 shows how the information in the selection hyperlink time table (1304) of the audience device (1300) is updated by connecting the portable device (1300) of the audience member (1100) to a communication network (1301) (for example, the Internet) and accessing the presentation hyperlink time table (1302) on the presentation server (1303).

FIG. 14 details the process of updating the selection hyperlink time table (1400) on the audience device (1300) using the presentation hyperlink time table (1401, 1302) on the presentation server (1303). Basically, in this process the hyperlink terms (or their short names and/or short descriptions) that were active at the universal times (1403) at which the audience member selected topics of interest, together with the URLs of those hyperlink terms, are identified in the presentation hyperlink time table (1401) located on the presentation server (1303) and copied into the selection hyperlink time table (1402) located on the audience device (1300).
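
The update process can be sketched as a lookup on a time-sorted table. The claims describe the matching record as the one whose universal time is closest to the selected time without exceeding it, which a binary search expresses directly. The Python below is a sketch under assumptions: `active_record` and `update_selections` are hypothetical names, both tables are plain in-memory lists rather than a server and a device, and the second timestamp (14:25:40) and the selection time (14:24:00) are invented for illustration.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def active_record(time_table, selected_time):
    """Return the latest record at or before `selected_time`, or None.

    `time_table` must be sorted by universal time, which holds when
    records are appended as terms are recognized.
    """
    times = [row["universal_time"] for row in time_table]
    i = bisect_right(times, selected_time)
    return time_table[i - 1] if i > 0 else None


def update_selections(selection_rows, time_table):
    """Copy name and URL of the active term into each selection row."""
    for row in selection_rows:
        match = active_record(time_table, row["universal_time"])
        if match is not None:
            row["name"] = match["name"]
            row["url"] = match["url"]
    return selection_rows


def utc(hour, minute, second):
    """Helper for the example: a UTC time on the day of the presentation."""
    return datetime(2001, 5, 12, hour, minute, second, tzinfo=timezone.utc)


time_table = [
    {"universal_time": utc(14, 23, 18), "name": "Resveratrol",
     "url": "http://www.ag.uiuc.edu/~ffh/resvera.html"},
    {"universal_time": utc(14, 25, 40), "name": "Ellagic acid",
     "url": "http://www.hopeforcancer.com/ellagicacid.htm"},
]
selections = [{"universal_time": utc(14, 24, 0), "name": None, "url": None}]
update_selections(selections, time_table)
print(selections[0]["name"])  # prints Resveratrol
```

A selection made before any term has been recognized finds no record and is left unchanged, which seems the safest behavior for a case the description does not address.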

FIG. 15 shows how, from the audience device (1500), the audience member selects and gains access to the web pages hyperlinked to the topics selected during the presentation. Basically, the audience member points to and selects a hyperlink term (1501) in the updated selection hyperlink time table (1502) and, by means of software on the audience device (1500), activates a web browser and triggers the hyperlink to the URL (1503) of the selected item (1501). In the example shown in this figure, the audience member selects the hyperlink term "ellagic acid" (a chemical compound found in wine) and triggers the hyperlink to the URL (1504), http://www.hopeforcancer.com/ellagicacid.htm.

FIG. 16 shows how the web page (1601) (for example, the document ellagicacid.htm) associated with the selected hyperlink term (1501) (for example, "ellagic acid") is received over the network (1602) from the accessed web server (1603) (for example, http://www.hopeforcancer.com) and displayed (1604) or played on the audience device (1600).

While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims

A method of generating a presentation hyperlink time table (Speech Hyperlink-Time table) in connection with a universal time system that provides absolute time, the method comprising:

providing a presentation hyperlink table comprising a plurality of entries, each entry comprising a hyperlink term and a corresponding network address in a network, the network address linking the hyperlink term to information related to the hyperlink term, the information being on a server in the network;

while a presenter is giving a presentation, recognizing each hyperlink term of the presentation hyperlink table that is pronounced by the presenter, wherein the recognizing is performed by a speech recognition system on a computing device; and

for each recognized hyperlink term:

determining the universal time at which the hyperlink term was recognized;

identifying, from the presentation hyperlink table, the network address corresponding to the recognized hyperlink term; and

generating, in the presentation hyperlink time table, a record comprising the universal time, the recognized hyperlink term, and the network address corresponding to the recognized hyperlink term.

Claim 2 was abandoned when the setup registration fee was paid.

The method of claim 1, wherein the universal time system is selected from the group consisting of a Global Positioning System (GPS) time system, a Coordinated Universal Time (UTC) system, a Greenwich Mean Time (GMT) system, and a time system derived from a free-running atomic clock of GPS satellites.

The method of claim 1, wherein determining the universal time comprises determining the universal time by means of a GPS receiver.

The method of claim 1, wherein providing the presentation hyperlink table comprises:

selecting the hyperlink terms from the presentation; and

for each selected hyperlink term, identifying a network address corresponding to the selected hyperlink term, and storing in the presentation hyperlink table an entry comprising the selected hyperlink term and the identified corresponding network address.

Claim 5 was abandoned upon payment of a set-up fee.

The method of claim 1, wherein the network is the Internet, the network address is a URL (Universal Resource Locator), the information is a web page, and the server is a web server.

(Deleted)

The method of claim 1, further comprising, prior to the recognizing step, training the speech recognition system to recognize the hyperlink terms when they are pronounced during the presentation.

Claim 8 was abandoned when the registration fee was paid.

The method of claim 1, wherein the presentation is included in a radio program or a television program.

The method of claim 1, wherein the determining, identifying, and generating steps are executed by a computer program comprising instruction code that is executed on the computing device.

(Deleted)

A method of processing a presentation in relation to a universal time system that provides absolute time, the method comprising:

inputting at least one selection command on an audience device, wherein the audience device is a computing device, each selection command is input in real time in response to a hyperlink term pronounced during the presentation, each pronounced hyperlink term appears in one of a plurality of records of a presentation hyperlink time table held on a presentation server, and each record of the presentation hyperlink time table comprises a hyperlink term of the presentation, the universal time at which that hyperlink term was pronounced during the presentation, and a network address linking the hyperlink term to information related to the hyperlink term, the information being on a server of a network; and

for each selection command input, determining the universal time at which the selection command was input and recording the determined universal time in a record of a selection hyperlink time table (Selections Hyperlink-Time table) held on the audience device.

The method of claim 12,

The audience device is coupled to the presentation server via the network,

Selecting at least one universal time from the selection hyperlink time table;

For each selected universal time,

Delivering the selected universal time to the presentation server;

Receiving, from the presentation server, the hyperlink term and the associated network address appearing in the record of the presentation hyperlink time table whose universal time is closest to the selected universal time without exceeding it; and

Storing the received hyperlink term and associated network address in the record of the selection hyperlink time table that includes the selected universal time.

How to handle a presentation.
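The lookup rule of claim 13 (the record whose universal time is closest to the selected universal time without exceeding it) can be sketched as a binary search. This is one possible illustration, not the implementation described in the patent; it assumes the presentation hyperlink time table is kept sorted by universal time, and the function name `find_record`, the sample terms, and the `example.com` addresses are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from typing import List, Optional, Tuple

# (universal time, hyperlink term, network address), sorted by universal time.
PresentationRecord = Tuple[datetime, str, str]


def find_record(
    table: List[PresentationRecord], selected: datetime
) -> Optional[PresentationRecord]:
    """Return the record with the latest universal time <= selected."""
    times = [rec[0] for rec in table]
    # bisect_right gives the index of the first time strictly after `selected`,
    # so the entry just before it is the closest one that does not exceed it.
    idx = bisect_right(times, selected)
    if idx == 0:
        return None  # the selection was made before any term was pronounced
    return table[idx - 1]


def utc(h: int, m: int, s: int) -> datetime:
    return datetime(2003, 7, 23, h, m, s, tzinfo=timezone.utc)


# Hypothetical table contents, for illustration only.
presentation_table: List[PresentationRecord] = [
    (utc(10, 0, 0), "speech recognition", "http://example.com/speech"),
    (utc(10, 2, 30), "word spotting", "http://example.com/spotting"),
    (utc(10, 5, 0), "universal time", "http://example.com/time"),
]

match = find_record(presentation_table, utc(10, 3, 0))
```

With the sample data, a selection made at 10:03:00 resolves to the term pronounced at 10:02:30, the most recent one before the key press.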

The method of claim 13,

Selecting a record of the selection hyperlink time table;

Connecting to information related to a hyperlink term in said selected record using a network address in said selected record;

Retrieving the information from a server in the network including the information;

Displaying the retrieved information on the audience device.

How to handle a presentation.
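The retrieval steps of claim 14 can be sketched as follows, assuming the network address is an HTTP URL and that the selected record is a simple dictionary. The names `fetch_over_network`, `retrieve_information`, and `display`, as well as the record keys, are hypothetical; the fetch function is passed in as a parameter so the sketch can be exercised without a network connection.

```python
from typing import Callable, Dict
from urllib.request import urlopen


def fetch_over_network(network_address: str) -> bytes:
    # Connect to the server named by the network address and read the content.
    with urlopen(network_address, timeout=10) as response:
        return response.read()


def retrieve_information(
    record: Dict[str, str],
    fetch: Callable[[str], bytes] = fetch_over_network,
) -> str:
    # Use the network address stored in the selected record to retrieve
    # the information related to its hyperlink term.
    raw = fetch(record["network_address"])
    return raw.decode("utf-8", errors="replace")


def display(record: Dict[str, str], content: str) -> None:
    # Stand-in for rendering on the audience device: print to the console.
    print(f"[{record['hyperlink_term']}] {record['network_address']}")
    print(content)
```

On a real audience device the display step would more likely hand the network address to a web browser; printing is used here only to keep the sketch self-contained.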

Claim 15 was abandoned upon payment of a registration fee.

The method of claim 12,

The universal time system is selected from the group consisting of a Global Positioning System (GPS) time system, a Coordinated Universal Time (UTC) system, a Greenwich Mean Time (GMT) system, and a time system derived from a free-running atomic clock of GPS satellites.

How to handle a presentation.

Claim 16 was abandoned upon payment of a registration fee.

The method of claim 12,

Determining the universal time includes determining the universal time by a GPS receiver at the audience device.

How to handle a presentation.

Claim 17 was abandoned upon payment of a registration fee.

The method of claim 12,

The audience device is selected from the group consisting of workstations, portable computers, PDAs, smartphones and other types of portable computing devices.

How to handle a presentation.

delete

Comprising an audience device that processes a presentation in relation to a universal time system that provides an absolute time reference,

Wherein the audience device is a computing device having a computer program therein, the computer program comprising instruction code that, when executed, implements the method of any one of claims 12 to 17.

Computing system.

Having recorded thereon a computer program comprising instruction code that, when executed, performs the method of claim 1 or claim 12.

Computer-readable recording media.