KR20010089293A

KR20010089293A - Method and apparatus to prepare listener-interest-filtered works

Info

Publication number: KR20010089293A
Application number: KR1020017004514A
Authority: KR
Inventors: Donald J. Hejna, Jr.
Original assignee: Donald J. Hejna, Jr.; Enounce, Incorporated
Priority date: 1998-10-09
Filing date: 1999-10-07
Publication date: 2001-09-29
Also published as: DE69937001T2; US20020082828A1; JP2003510625A; WO2000022611A1; US20030046080A1; US20080140414A1; DE69937001D1; ATE371927T1; EP1125287B1; AU6296599A; WO2001020596A1; EP1125287A4; US6374225B1; US20050033584A1; US8452589B2; US7299184B2; US20130253921A1; US6801888B2; US7043433B2; US20160259541A1

Abstract

An embodiment of the present invention is a method of presenting an audio or audio-visual work including: (a) a user selecting a speed contour from one or more speed contours stored in a database apparatus; (b) presenting the audio or audio-visual work at a playback device using the user selected speed contour to provide presentation rates; (c) during presenting, the user selecting another speed contour from the one or more speed contours stored in the database apparatus; and (d) continuing to present the audio or audio-visual work at the playback device using the user selected another speed contour to provide presentation rates.

Description

METHOD AND APPARATUS TO PREPARE LISTENER-INTEREST-FILTERED WORKS

Presently known Time-Scale Modification (TSM) methods enable digitally recorded audio to be modified so that the perceived articulation rate of spoken passages, i.e., the speaking rate, can be changed dynamically during playback. Typical applications of such TSM methods include, but are not limited to, speed reading for the blind, talking books, digitally recorded lectures, slide shows, multimedia presentations, and foreign language learning. In such applications, a listener can control the speaking rate during playback of a previously recorded speaker; these are referred to as Listener-Directed Time-Scale Modification (LD-TSM) applications. This lets the listener slow down or speed up the articulation rate and, with it, the information delivery rate of the recorded speaker. As is well known to those of ordinary skill in the art, LD-TSM applications use TSM methods to speed up or slow down speech or audio so that it remains intelligible at the increased or decreased playback rate. As a result, a listener can, for example, readily comprehend material while fast-forwarding through it.

In a typical LD-TSM system, input from the listener can be specified in a number of ways, for example by key (button) presses, mouse movements, or voice commands, all of which are referred to below as "keypresses". It is therefore readily understood that an LD-TSM system lets a listener adjust the information delivery rate of digital audio media to suit the listener's interest and rate of comprehension.

As can readily be appreciated from the above, making optimal use of an LD-TSM system requires determining how listeners interact with audio media that provide TSM. In particular, the information delivery rate a listener actually selects depends on many factors, such as the intelligibility of the speaker, the listener's interest in the subject matter, the listener's familiarity with the subject matter, whether the listener is transcribing the content, and the amount of time the listener has allotted to receiving the material.

Prior art methods of determining a listener's interest in portions of speech and/or audio are inherently inaccurate. In particular, such methods include detecting patterns of fast-forwarding and rewinding of a cassette tape, produced for example by button presses. Using such patterns has several drawbacks. For example, a listener often alternates between fast-forward and rewind over a portion of the audio material because the material is unintelligible, or is not presented at all, while fast-forwarding or rewinding. Specifically, when the playback position is advanced, playback is either interrupted while the position moves through the material, or an unintelligible version of the audio is presented (for example, a "chipmunk-like" sound when speed is increased). Present methods of determining listener interest are therefore of little use in determining an optimal information delivery rate.

In light of the above, there is a need in the art for a method and apparatus for determining listener interest in portions of speech, audio, and/or audio-visual works. There is also a need in the art for a method and apparatus that replays speech, audio, and/or audio-visual works in accordance with the determined listener interest to provide a listener-interest-filtered work (LIF work).

The present invention relates to the field of speech, audio, and audio-visual works. In particular, it relates to a method and apparatus for receiving listener input regarding the playback rate desired for portions of a speech, audio, and/or audio-visual work, and for developing a "Speed Contour" or a "Conceptual Speed Association data structure" that represents that input. The listener input acts as a proxy for the listener's interest in and/or comprehension of (and/or ability to transcribe) the work, referred to below as "listener interest". For example, a listener who wants to savor a portion of the work, finds it hard to understand, or is transcribing information from it will want to slow it down. The invention further relates to a method and apparatus for replaying a speech, audio, and/or audio-visual work in accordance with a Speed Contour or Conceptual Speed Association data structure to produce a "listener-interest-filtered work" (LIF work). LIF works are useful in many applications such as education, advertising, news delivery, entertainment, and public safety announcements.

FIG. 1 is a block diagram of a first embodiment of the present invention, which generates a Speed Contour for an audio or audio-visual work.

FIG. 2 is a flowchart of an algorithm used in one embodiment of the Speed Contour Generator shown in FIG. 1.

FIG. 3 graphically illustrates Speed Contours for several different listening sessions of the same audio or audio-visual work.

FIG. 4 shows a graphical representation of a Speed Contour generated from the user-specified TSM rates, or playback rates, of several different listening sessions of the same audio or audio-visual work.

FIG. 5 is a block diagram of a second embodiment of the present invention, which generates a Speed Contour for an audio or audio-visual work, wherein user input and a word map of the work are used to provide the Speed Contour.

FIG. 6 shows a two-dimensional graph displaying the sound waveform of an audio or audio-visual work together with the corresponding text.

FIG. 7 shows a display of an audio or audio-visual work rendered as text.

FIG. 8 is a block diagram of a third embodiment of the present invention, which generates a Conceptual Speed Association data structure (CSA DS) for an audio or audio-visual work.

FIG. 9 is a flowchart of an algorithm used in one embodiment of the CSA DS Generator shown in FIG. 8.

FIG. 10 is a block diagram of a fourth embodiment of the present invention, which uses a Speed Contour in conjunction with an audio or audio-visual work to produce an LIF work.

FIG. 11 is a block diagram of a fifth embodiment of the present invention, which uses a CSA data structure in conjunction with an audio or audio-visual work to produce an LIF work.

FIG. 12 is a flowchart of an algorithm used in one embodiment of the TSM Rate Arbiter of FIG. 11 to provide a TSM rate, or playback rate.

Summary of the Invention

Embodiments of the present invention advantageously satisfy the above-identified needs in the art and provide a method and apparatus for determining listener interest in portions of speech, audio, and/or audio-visual works, and for developing a Speed Contour or Conceptual Speed Association data structure that represents a measure of listener interest. Further embodiments provide a method and apparatus that use a Speed Contour or Conceptual Speed Association data structure to replay a speech, audio, and/or audio-visual work in accordance with it, so as to provide a listener-interest-filtered work (LIF work).

One embodiment of the present invention is an apparatus for generating a Speed Contour that comprises affinity information used to obtain Time-Scale Modification (TSM) rates and identifier information used to obtain identifiers of the portions of an audio or audio-visual work associated with those TSM rates, the apparatus comprising: (a) a user input device that receives user information and directs input of a portion of the audio or audio-visual work; (b) a time-scale modification system that, in response to an identifier of the portion, the portion, and a TSM rate, generates a time-scale modified portion; (c) a time-scale modification monitor that, in response to the user information, the identifier of the portion, and the portion, generates the TSM rate and the identifier of the portion associated with that TSM rate; and (d) a Speed Contour generator that, in response to the TSM rate and the associated identifier of the portion, generates the Speed Contour.

Another embodiment of the present invention is an apparatus for generating a Conceptual Speed Association data structure that comprises affinity information used to obtain TSM rates and concept information used to obtain concept identifiers for the portions of an audio or audio-visual work associated with those TSM rates, the apparatus comprising: (a) a user input device that receives user information and directs input of a portion of the audio or audio-visual work; (b) a time-scale modification system that, in response to an identifier of the portion, the portion, and a TSM rate, generates a time-scale modified portion; (c) a concept decoder that, in response to the identifier of the portion and the portion, generates a concept for the portion; (d) a time-scale modification concept monitor that, in response to the user information and the concept, generates the TSM rate and a concept identifier associated with the TSM rate; and (e) a Conceptual Speed Association data structure generator that, in response to the TSM rate and the associated concept identifier, generates the Conceptual Speed Association data structure.

Yet another embodiment of the present invention is an apparatus for replaying an audio or audio-visual work in conjunction with a Speed Contour that comprises affinity information used to obtain TSM rates and identifier information used to obtain identifiers of the portions of the work associated with those TSM rates, the apparatus comprising: (a) an input device that directs input of a portion of the audio or audio-visual work; (b) a time-scale modification system that, in response to an identifier of the portion, the portion, and a TSM rate, generates a time-scale modified portion; (c) a playback device that, in response to the time-scale modified portion, plays it back; and (d) a time-scale modification rate determiner that, in response to the Speed Contour and the identifier of the portion, generates the TSM rate.

Another embodiment of the present invention is an apparatus for replaying an audio or audio-visual work in conjunction with a Conceptual Speed Association (CSA) data structure that comprises affinity information used to obtain TSM rates and concept information used to obtain concept identifiers of the portions of the work associated with those TSM rates, the apparatus comprising: (a) an input device that directs input of a portion of the audio or audio-visual work; (b) a time-scale modification system that, in response to an identifier of the portion, the portion, and a TSM rate, generates a time-scale modified portion; (c) a playback device that, in response to the time-scale modified portion, plays it back; (d) a concept decoder that, in response to the identifier of the portion and the portion, generates a concept for the portion; and (e) a time-scale modification concept look-up that, in response to the concept and the CSA data structure, generates the TSM rate.

Another embodiment of the present invention is a method of generating a listener-interest-filtered work for an audio or audio-visual work, comprising the steps of: (a) generating one or more average Speed Contours for one or more audio or audio-visual works for one or more categories of users; (b) converting the one or more average Speed Contours into one or more Conceptual Speed Association data structures; and (c) forming a listener-interest-filtered Conceptual Speed Association data structure from the one or more Conceptual Speed Association data structures. This embodiment further includes the steps of: using the listener-interest-filtered Conceptual Speed Association data structure to produce a listener-interest-filtered audio or audio-visual work; or converting the listener-interest-filtered Conceptual Speed Association data structure into a listener-interest-filtered Speed Contour; or using the listener-interest-filtered Speed Contour to produce a listener-interest-filtered audio or audio-visual work; or using any of the foregoing to determine listener interest, preferred delivery rate, or listener affinity for the concepts or material of the audio or audio-visual work.

Embodiments of the present invention pertain to a method and apparatus for receiving listener input regarding the playback rate desired for portions of a speech, audio, and/or audio-visual work, and for developing a "Speed Contour" or "Conceptual Speed Association data structure" that represents that input. The listener input acts as a proxy for the listener's interest in and comprehension of the work (referred to below as "listener interest"). For example, a listener who wants to savor a portion of the work, or who found it hard to understand, will want to slow it down. Other embodiments pertain to a method and apparatus for replaying a speech, audio, and/or audio-visual work in accordance with a Speed Contour or Conceptual Speed Association data structure to produce a new work referred to as a "listener-interest-filtered work" (LIF work). As described in detail below, LIF works are useful in many applications such as education, advertising, news delivery, and public safety announcements.

Generation of Speed Contours and Conceptual Speed Association Data Structures

In accordance with the present invention, a first embodiment generates a Speed Contour, which is optionally stored for later use.

FIG. 1 shows a block diagram of embodiment 1000, a first embodiment of the present invention, which generates a Speed Contour for an audio or audio-visual work. As shown in FIG. 1, embodiment 1000 includes User Interface (UI) 100, which receives input from a user and provides output signals indicating that input. The user input is interpreted by User Input Processor/Playback Control (UIP/PC) 200 of embodiment 1000 as indicating one of the following user-selected options: (a) select a file to play, the file corresponding to a particular audio or audio-visual work (the selected file may be input directly to embodiment 1000 or may be a file already stored in embodiment 1000); (b) start playback of the selected file; (c) halt playback of the selected file; (d) pause playback of the selected file; (e) change the TSM rate, i.e., the playback rate, of the portion of the audio or audio-visual work being played; or (f) specify the parameters Interval_Size, Speed_Change_Resolution, Average_or_Overwrite, and Log_Repeats, which are used in the Speed Contour generation method described in detail below. Many devices for receiving input from a user are well known to those of ordinary skill in the art. For example, many commercially available devices detect (a) the pressing of a key, (b) the actuation of a switch on a mouse, (c) the movement of a slider or position indicator, or (d) a spoken user command and, in response, send digital data representing the key press, switch actuation, slider or position indicator movement, or spoken command to a processor.
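The options (a) through (f) interpreted by UIP/PC 200 can be sketched as a small command dispatcher. This is only an illustrative sketch: the names `Command`, `PlayerState`, and `apply_command` are hypothetical, the text does not say how halt differs from pause (the sketch assumes halt returns to the start of the file while pause keeps the position), and the four parameters are stored without interpreting them because their meaning is defined only later in the specification.

```python
from dataclasses import dataclass, replace
from enum import Enum, auto
from typing import Optional, Tuple


class Command(Enum):
    """User options (a) to (f); the enum itself is a hypothetical encoding."""
    SELECT_FILE = auto()    # (a)
    START = auto()          # (b)
    HALT = auto()           # (c)
    PAUSE = auto()          # (d)
    SET_TSM_RATE = auto()   # (e)
    SET_PARAMETER = auto()  # (f)


# Parameter names taken from the text; their semantics are not assumed here.
PARAMETER_NAMES = {"Interval_Size", "Speed_Change_Resolution",
                   "Average_or_Overwrite", "Log_Repeats"}


@dataclass(frozen=True)
class PlayerState:
    file: Optional[str] = None
    playing: bool = False
    position: int = 0                        # playback position as a sample index
    tsm_rate: float = 1.0                    # assumed convention: 1.0 = normal speed
    params: Tuple[Tuple[str, object], ...] = ()


def apply_command(state: PlayerState, command: Command, value=None) -> PlayerState:
    """Return the state that results from one interpreted user command."""
    if command is Command.SELECT_FILE:
        return replace(state, file=value, playing=False, position=0)
    if command is Command.START:
        if state.file is None:
            raise ValueError("no file selected")
        return replace(state, playing=True)
    if command is Command.HALT:
        # Assumption: halting stops playback and returns to the beginning.
        return replace(state, playing=False, position=0)
    if command is Command.PAUSE:
        # Assumption: pausing stops playback but keeps the current position.
        return replace(state, playing=False)
    if command is Command.SET_TSM_RATE:
        if value <= 0:
            raise ValueError("TSM rate must be positive")
        return replace(state, tsm_rate=float(value))
    if command is Command.SET_PARAMETER:
        name, setting = value
        if name not in PARAMETER_NAMES:
            raise KeyError(name)
        merged = dict(state.params)
        merged[name] = setting
        ordered = tuple(sorted(merged.items(), key=lambda item: item[0]))
        return replace(state, params=ordered)
    raise ValueError(f"unknown command: {command!r}")
```

Keeping the state immutable makes each interpreted command easy to log, which suits a system whose purpose is to record what the user asked for and when.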

UIP/PC 200 receives input from UI 100 and (a) converts user commands to numeric values; (b) interprets user commands to set parameter values and to control the generation, use, modification, or overriding of Speed Contours; and (c) directs the access and loading of a data stream from the audio or audio-visual work (i.e., performs playback control) by sending requests for stream data to digital storage device 75 or to another source of audio or audio-visual data. In the case of digital storage device 75, UIP/PC 200 may request access to a file of digital data representing the audio or audio-visual work that is stored in a file system on the device. To direct the access and loading of the data stream, UIP/PC 200 interprets the user input together with the location of the digital samples representing the work stored on digital storage device 75, and computes the playback position in the selected file as a particular sample.
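As a minimal sketch of how a playback position might be expressed as a particular sample, the helpers below convert a time offset into a sample index and step backwards for a rewind. The function names and the choice of seconds as the user-facing unit are assumptions made for illustration; the text states only that the position is computed as a sample in the selected file.

```python
def seconds_to_sample(seconds: float, sample_rate: int, total_samples: int) -> int:
    """Map a time offset to a sample index, clamped to the bounds of the file."""
    if sample_rate <= 0 or total_samples <= 0:
        raise ValueError("sample_rate and total_samples must be positive")
    index = int(round(seconds * sample_rate))
    return max(0, min(index, total_samples - 1))


def rewind(position: int, seconds_back: float, sample_rate: int) -> int:
    """Move the playback position back by a time span, never before sample 0."""
    return max(0, position - int(round(seconds_back * sample_rate)))
```

For audio sampled at 8000 samples per second, a request to play from the 2 second mark resolves to sample 16000.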

Digital storage device 75 receives the following as input: (a) requests for stream data from UIP/PC 200; optionally (b) time-scale modified output from TSM Subsystem 300; and optionally (c) a stream of data representing a Speed Contour from Speed Contour Generator 500. Digital storage device 75 produces the following as output: (a) a stream of data representing an audio or audio-visual work; and (b) a stream of location information, for example the position in the file of the data being output. Many ways of implementing a digital storage device are well known to those of ordinary skill in the art, for example a hard disk drive that stores and retrieves general-purpose data.

An audio or audio-visual work is typically stored in digital form on digital storage device 75. Many commercially available devices suitable for use as a digital storage device are well known to those of ordinary skill in the art, for example CD-ROMs, digital tape, and magnetic disks. Digital storage device 75 receives data requests from UIP/PC 200 and, in accordance with methods well known to those of ordinary skill in the art, provides a stream of digital samples representing the audio and/or audio-visual work. In an alternative embodiment, the audio or audio-visual work is stored in analog form on an analog storage device. In that embodiment, the stream of analog signals is input to an apparatus (not shown) that converts analog samples to digital samples. Many commercially available devices are well known to those of ordinary skill in the art that receive an input analog signal such as a speech signal, sample it at a rate of at least the Nyquist rate to provide a stream of digital signals, and allow it to be converted back to an analog signal without loss of fidelity. The digital samples are then transferred to TSM Subsystem 300.

TSM subsystem 300 receives as input: (a) a stream of samples representing portions of the audio or audio-visual work from digital storage device 75; (b) stream location information from digital storage device 75, for example a sample count or time value, used to identify the position in the stream of the samples being sent; and (c) the desired TSM rate, or playback rate, from TSM monitor 400. Output from TSM subsystem 300 is applied as input to: (a) digital-to-analog converter/audio and/or audio-visual playback device 600 (DA/APD 600); and, optionally, (b) digital storage device 75, for storing the time-scale modified output, i.e., the LIF work, if desired. DA/APD 600 is apparatus well known to those of ordinary skill in the art for receiving digital samples and presenting an audio or audio-visual work.

In accordance with the present invention, the output of TSM subsystem 300 is a stream of digital samples representing the audio or audio-visual work at the playback rate supplied by TSM monitor 400; presenting this output gives the user feedback on the TSM rate requested through user input. In particular, the user listens to the time-scale modified output and can change the TSM rate, or playback rate, by providing further input using UI 100. Specifically, if the user wishes to speed up or slow down a portion of the work that has just been played, the user can provide input using UI 100 to rewind the work to the desired portion and replay it at a changed TSM rate or playback rate. In this manner, the user determines the desired TSM rate, or playback rate, for each portion of the audio or audio-visual work.

TSM subsystem 300 modifies the input stream of data in accordance with well-known TSM methods to produce, as output, a stream of samples representing a time-scale modified signal. In a preferred embodiment of the present invention, the TSM method used is the method disclosed in U.S. Patent No. 5,175,769 (the '769 patent), which is incorporated herein by reference; the inventor of the present invention is also a co-inventor of the '769 patent. As those of ordinary skill in the art will readily appreciate, whenever embodiment 1000 provides playback of an audio-visual work, TSM subsystem 300 speeds up or slows down the visual information to match the audio of the audio-visual work. To do this in a preferred embodiment, the video signal is "frame-subsampled" or "frame-replicated" in accordance with any one of the many methods known to those of ordinary skill in the art, so that synchronization is maintained between the audio and visual portions of the work. Thus, if the audio is to be sped up and samples are requested at a faster rate, the frame stream is subsampled, i.e., frames are skipped.
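As an illustration of frame-subsampling and frame-replication, the following is a minimal Python sketch, not the method of the embodiment. It assumes the simplest possible policy: each output frame shows the source frame at the scaled position, truncated to an integer index. The function name `resample_frame_indices` is hypothetical.

```python
def resample_frame_indices(num_frames: int, rate: float) -> list[int]:
    """Return the source frame index to show for each output frame.

    rate > 1.0 skips source frames (frame-subsampling);
    rate < 1.0 repeats source frames (frame-replication).
    Assumption: truncating index selection; real systems may choose
    frames differently.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    out_count = int(num_frames / rate)  # number of frames in the output
    # Clamp so rounding can never index past the last source frame.
    return [min(int(i * rate), num_frames - 1) for i in range(out_count)]


# Double speed keeps every second frame; half speed shows each frame twice.
fast = resample_frame_indices(6, 2.0)
slow = resample_frame_indices(3, 0.5)
```

Because the video frame count scales by the same factor as the audio duration, the two portions of the work stay in step.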

TSM monitor 400 receives the following as input to guide embodiment 1000 in generating a speed contour: (a) user input, translated by UIP/PC 200 into a desired TSM rate or playback rate (the desired TSM rate or playback rate may indicate the TSM rate or playback rate for the portion of the input audio or audio-visual work currently being perceived); (b) a stream of samples representing portions of the audio or audio-visual work from digital storage device 75; (c) current stream location information from digital storage device 75, used to identify the position in the stream of the samples being sent, for example a sample count or time value for the beginning of the group of samples transferred from digital storage device 75; and (d) the parameters Interval_Size and Speed_Change_Resolution from UIP/PC 200.

A speed contour is information, for example in the form of a data stream, that indicates the desired TSM rate or playback rate for an audio or audio-visual work at all or some points of the work. In practice, the desired TSM rate or playback rate varies slowly compared with the sampling rate of the digital signal comprising the work, so the time resolution that embodiment 1000 needs in order to generate it is coarse. As a result, in a preferred embodiment of the present invention, the speed contour comprises a single TSM value associated with a specific group of samples corresponding to a specific segment of the audio or audio-visual work. Alternatively, a TSM value may be associated with each sample of the input audio or audio-visual work.

In practice, the precision needed to represent the TSM rate or playback rate is limited. Thus, in a preferred embodiment of the present invention, instead of using a continuous range of TSM rates or playback rates, the TSM rate is quantized both in time, to fixed intervals, and in value, to a set of quantized levels used to represent it. This is explained as follows.

Two parameters guide the described embodiment of TSM monitor 400:

1. Interval_Size: This parameter sets the time interval, given as a number of samples of the input audio or audio-visual work, that must elapse between successive examinations of the TSM rate or playback rate for a change.

2. Speed_Change_Resolution: This parameter gives the size of the difference between adjacent quantized levels used to represent the TSM rate or playback rate.

TSM monitor 400 uses the parameter Interval_Size to segment the input digital stream and determines a single TSM rate for each segment of the input digital stream, for example the TSM rate at the beginning or end of the segment, or the arithmetic mean of the TSM rates over the segment. Note that the length of each segment is given by the parameter Interval_Size.
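The segmentation step can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the embodiment's implementation: the requested rate is assumed to be available as one value per input sample, and the names `segment_rates` and `policy` are hypothetical. The three policies correspond to the examples given above (rate at the start, rate at the end, arithmetic mean).

```python
from statistics import fmean


def segment_rates(rates, interval_size, policy="mean"):
    """Collapse a per-sample rate trace into one rate per segment.

    rates: requested TSM rate for each input sample (assumed input form).
    interval_size: segment length in samples (the Interval_Size parameter).
    Returns a list of (segment_start_sample, rate) pairs.
    """
    if interval_size <= 0:
        raise ValueError("interval_size must be positive")
    pairs = []
    for start in range(0, len(rates), interval_size):
        chunk = rates[start:start + interval_size]  # last chunk may be short
        if policy == "start":
            rate = chunk[0]
        elif policy == "end":
            rate = chunk[-1]
        elif policy == "mean":
            rate = fmean(chunk)
        else:
            raise ValueError(f"unknown policy: {policy!r}")
        pairs.append((start, rate))
    return pairs


pairs = segment_rates([1.0, 1.0, 2.0, 2.0, 1.5], interval_size=2)
```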

TSM monitor 400 uses the parameter Speed_Change_Resolution to determine the appropriate TSM rate to pass to TSM subsystem 300 and to speed contour generator 500. The input TSM rate requested by the user is converted to one of the quantized levels in a manner well known to those of ordinary skill in the art. This means that the output TSM rate or playback rate can change only when the requested input TSM rate changes by an amount that exceeds the difference between quantization levels, i.e., Speed_Change_Resolution. As a practical matter, the parameter Speed_Change_Resolution filters out small changes in TSM rate or playback rate, such as those that occur when the user changes the rate by a small amount and then immediately returns it to its previous value. The parameters Interval_Size and Speed_Change_Resolution can be set as predetermined parameters for embodiment 1000 in accordance with methods well known to those of ordinary skill in the art, or they can be entered and/or changed by receiving user input through UI 100 in accordance with such methods. However, the manner in which these parameters are set and/or changed is not shown, for ease of understanding the present invention.
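One way to picture the quantization is the short Python sketch below. It assumes that the quantized levels are integer multiples of Speed_Change_Resolution and that a request snaps to the nearest level; the text only says the conversion uses well-known methods, so both choices, and the name `quantize_rate`, are assumptions made for illustration.

```python
def quantize_rate(requested: float, resolution: float) -> float:
    """Snap a requested TSM rate to the nearest quantized level.

    Assumption: levels are integer multiples of `resolution`
    (the Speed_Change_Resolution parameter).
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    return round(requested / resolution) * resolution


# Small adjustments that stay within the same level produce no change
# in the output rate.
a = quantize_rate(1.24, 0.25)
b = quantize_rate(1.26, 0.25)
same_level = (a == b)
```

Nearest-level rounding does not by itself give hysteresis at a level boundary; an implementation that wants the output to change only after a full resolution step would also need to remember the previous output level.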

TSM monitor 400 produces as output a pair of values for each segment of the input stream specified by Interval_Size: (a) one value of the pair represents location information in the input digital stream for the segment; and (b) the other value of the pair represents the TSM rate or playback rate requested by the user for that segment. The pair of values is applied as input to speed contour generator 500, and the value of the pair that represents the TSM rate is applied as input to TSM subsystem 300.

Speed contour generator 500 accepts as input: (a) from TSM monitor 400, the value of the pair that represents location information in the input digital stream for a segment; (b) from TSM monitor 400, the value of the pair that represents the TSM rate or playback rate for the segment; and (c) from UIP/PC 200, the parameters Average_or_Overwrite and Log_Repeats. Speed contour generator 500 uses a database or scratch-pad memory to maintain a list of records, each of which stores a TSM rate and the stream location information to which that TSM rate applies. FIG. 2 shows a flowchart of an algorithm used in one embodiment of speed contour generator 500 to generate a speed contour. The following fields are used in the records of this embodiment.

1. Rec: A unique number identifying each record and the order in which it was allocated/created.

2. Loc: A data field containing the stream location information for a segment of the input stream.

3. Play_cnt: A data field containing the number of times the segment has been played. Play_cnt is set when the record is created.

4. TSM: A data field containing the TSM rate for the segment.

In addition to the data fields described above, two parameters guide speed contour generator 500 in generating a speed contour:

1. Average_or_Overwrite: This parameter determines how information is logged when the user rewinds, or manually moves the playback position (for example with a mouse, slider, or position indicator), so that previously played portions of the input audio or audio-visual work are played again. If the value of the parameter is "Average", the TSM rate or playback rate for a repeated segment is computed by averaging the TSM rates or playback rates specified each time the segment was played. If the value of the parameter is "Overwrite", only the last TSM rate or playback rate specified for the repeated segment is used for that segment of the speed contour.

2. Log_Repeats: This parameter is a Boolean variable that, if true, directs speed contour generator 500 to record the TSM rate each time a section of the input audio or audio-visual work is played by the user. In that case, a TSM rate or playback rate is stored every time the segment is played.

The parameters Average_or_Overwrite and Log_Repeats can be set as predetermined parameters for embodiment 1000 in accordance with methods well known to those of ordinary skill in the art, or they can be entered and/or changed by receiving user input through UI 100 in accordance with such methods. However, the manner in which these parameters are set and/or changed is not shown, for ease of understanding the present invention.

As shown in FIG. 2, the segment location and the TSM rate are applied as input to box 1500. At box 1500, a search is performed to locate any record in the database that contains the same segment location value. Control then passes to box 1510. At box 1510, a decision is made. If a record containing the same segment location information is found, the record is noted and control passes to box 1520. If no record is found, control passes to box 1570.

At box 1570, a new record is created in the database, and the internal variable Record_Count is updated to reflect the number of records in the database (Record_Count is initialized to 0 at the start of the generation of each new speed contour). Control then passes to box 1580. At box 1580, the data values are stored in the fields of the newly created record, and control passes to box 1550.

At box 1520, a decision is made. If the parameter Log_Repeats is true, control passes to box 1570; if Log_Repeats is false, control passes to box 1530. At box 1530, a decision is made. If the value of the parameter Average_or_Overwrite equals "Average", control passes to box 1540. If the value of Average_or_Overwrite equals "Overwrite", control passes to box 1560.

At box 1540, the data in fields TSM and Play_cnt are updated. As shown in FIG. 2, the previous value of Play_cnt is used to compute the arithmetic mean of the TSM rates, and Play_cnt is incremented. Control then passes to box 1550. At box 1560, the data in fields TSM and Play_cnt are updated. As shown in FIG. 2, the current TSM rate overwrites the previous TSM rate, and Play_cnt is incremented. Control then passes to box 1550.

At box 1550, the newly created or modified record is stored in the database. The algorithm then waits until new data values arrive at speed contour generator 500, whereupon control passes to box 1500. When playback of the audio or audio-visual work is complete, the database is scanned, and the TSM rate or playback rate for each segment of the input signal is extracted and used to construct the speed contour. Note that when no segments are repeated and the work is played in its entirety, the speed contour is obtained by sorting the database records in ascending order of the allocation order stored in the Rec data field. Note also that the speed contour can be stored for later use in accordance with any one of the many methods well known to those of ordinary skill in the art for storing such digital data streams. For example, the speed contour can be stored on digital storage device 75 or some other storage medium, or it can be transmitted to another system via a transmission device such as a modem.
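The flow through boxes 1500 to 1580 can be summarized in a short Python sketch. This is a reading of the description above, not the flowchart itself: FIG. 2 is not reproduced here, so the running-mean formula, the initial Play_cnt value of 1, and the in-memory list used in place of a database are all assumptions. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    rec: int        # unique number, also the creation order (Rec)
    loc: int        # stream location of the segment (Loc)
    play_cnt: int   # times the segment has been played (Play_cnt)
    tsm: float      # TSM rate for the segment (TSM)


@dataclass
class SpeedContourGenerator:
    average_or_overwrite: str = "Average"   # or "Overwrite"
    log_repeats: bool = False
    records: list = field(default_factory=list)

    def log(self, loc: int, tsm: float) -> Record:
        # Boxes 1500/1510: search for a record with the same location.
        found = next((r for r in self.records if r.loc == loc), None)
        if found is None or self.log_repeats:
            # Boxes 1570/1580: create and fill a new record.
            # Assumption: Play_cnt starts at 1 (first play).
            new = Record(len(self.records) + 1, loc, 1, tsm)
            self.records.append(new)
            return new
        if self.average_or_overwrite == "Average":
            # Box 1540: running arithmetic mean using the old Play_cnt.
            found.tsm = (found.tsm * found.play_cnt + tsm) / (found.play_cnt + 1)
        elif self.average_or_overwrite == "Overwrite":
            # Box 1560: the latest rate replaces the previous one.
            found.tsm = tsm
        else:
            raise ValueError("Average_or_Overwrite must be 'Average' or 'Overwrite'")
        found.play_cnt += 1
        return found

    def contour(self):
        """(Loc, TSM) pairs in ascending Rec order."""
        return [(r.loc, r.tsm) for r in sorted(self.records, key=lambda r: r.rec)]


gen = SpeedContourGenerator("Average")
gen.log(0, 1.0)
gen.log(100, 2.0)
gen.log(0, 2.0)          # segment at location 0 replayed at a new rate
result = gen.contour()
```

Sorting by Rec reproduces stream order only when nothing is replayed out of order; with Log_Repeats set, the list holds one record per play, and a consumer would have to decide how to combine them.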

Although FIG. 1 shows embodiment 1000 as comprising separate modules, in a preferred embodiment UI 100, UIP/PC 200, TSM subsystem 300, TSM monitor 400, and speed contour generator 500 are embodied as software programs or modules that run on a general-purpose computer such as a personal computer. Further, digital storage device 75 can be embodied as a disk drive or as RAM, and digital-to-analog converter 600 is embodied as a typical accessory to a general-purpose computer, such as a sound card in a personal computer. In light of the detailed description above, how to implement these programs or modules in software should be well known to those of ordinary skill in the art.

In accordance with one embodiment of the present invention, the data in the speed contour for a particular user can be presented in graphical form, so that the TSM rates or playback rates selected by a user or group of users are displayed and similarities or differences can be identified. In one embodiment, the TSM rate is displayed on the vertical axis of a two-dimensional graph, and the segment number or time value is displayed on the horizontal axis. FIG. 3 shows, in graphical form, the speed contours for several different listening sessions of the same audio or audio-visual work. By displaying speed contours in graphical form, information about user interest, user comprehension, and user confusion can be inferred. For example, all of the users slowed the TSM rate or playback rate of the work at segment 1000 (A in FIG. 3) and then sped it up at segment 2200 (B in FIG. 3). From this one may infer either that the users were more interested in the material presented in the period between segments 1000 and 2200, or that the complexity of the material changed such that the TSM rate or playback rate used for the preceding segments was too fast for a safe and complete understanding of that portion of the work. Methods for providing a graphical display of speed contours stored in accordance with embodiment 1000, and for storing the speed contours of several users and/or several sessions of the same user together with associated identifying information, so that information relating to a particular one of the stored speed contours can be retrieved, are well known to those of ordinary skill in the art.

An alternative embodiment of the present invention has the same components as those described above in connection with embodiment 1000, except for speed contour generator 500. In the alternative embodiment, speed contour generator 5000 outputs a "derivative" speed contour, which contains the derivative of the TSM rate or playback rate for each segment of the input audio or audio-visual work. FIG. 4 shows a graphical representation of speed contours made using the first derivative of the TSM rate or playback rate specified by users for several different listening sessions of the same audio or audio-visual work. In the two-dimensional graph of FIG. 4, the first derivative of the TSM rate is displayed on the vertical axis and time is displayed on the horizontal axis. The same data displayed in FIG. 3 is used to generate the derivative speed contour for each user. As can be seen in FIG. 4, a derivative speed contour shows the changes in TSM rate or playback rate requested by the user in a more pronounced manner that is somewhat easier to observe. In addition, it is readily appreciated that a derivative speed contour contains less data than a speed contour: changes in TSM rate or playback rate are relatively infrequent, and only the segments associated with a non-zero derivative need to be stored. How to modify the algorithm shown in FIG. 2 to determine a derivative speed contour, or how to derive a derivative speed contour from a speed contour, using methods well known in the art, should be clear to those of ordinary skill in the art.
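Deriving a derivative speed contour from an ordinary speed contour can be sketched in a few lines of Python. The sketch assumes the "derivative" is the first difference between consecutive segments and that the contour is a list of (location, rate) pairs in stream order; the function name `derivative_contour` is hypothetical.

```python
def derivative_contour(contour):
    """Return (location, rate_change) for segments where the rate changes.

    contour: list of (location, tsm_rate) pairs in stream order.
    Assumption: the derivative is approximated by the first difference
    between consecutive segments; zero differences are not stored.
    """
    changes = []
    previous_rate = None
    for loc, rate in contour:
        if previous_rate is not None and rate != previous_rate:
            changes.append((loc, rate - previous_rate))
        previous_rate = rate
    return changes


contour = [(0, 1.0), (100, 1.0), (200, 1.5), (300, 1.5), (400, 1.0)]
changes = derivative_contour(contour)
```

Five stored segments reduce to two stored changes here, which illustrates why the derivative form holds less data when rate changes are infrequent.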

The term "average speed contour" refers to a speed contour obtained for a particular audio or audio-visual work by averaging several speed contours that are generated, through use of an embodiment of the present invention such as embodiment 1000 described above, when a particular user listens to the work several times. The TSM rate or playback rate for a particular segment of the average speed contour is obtained by computing the arithmetic mean of the TSM rates or playback rates for the corresponding segment of the work in each of the several speed contours. How to store speed contours generated in accordance with embodiment 1000 for several users and/or several sessions of the same user, together with associated identifying information, and how to retrieve information relating to a particular one of the stored speed contours, should be clear to those of ordinary skill in the art. Likewise, how to compute an average speed contour from any number of stored speed contours should be clear to those of ordinary skill in the art.

One use of an average speed contour is in creating an advertising or informational audio or audio-visual work that contains information, such as a telephone number, that listeners are expected to transcribe. To determine the optimal information delivery rate, i.e., one that lets listeners successfully write down the desired information, an average speed contour is generated from speed contours produced by representative members of the target audience. Another use of an average speed contour is to present information at the maximum delivery rate for an audio or audio-visual work, the maximum delivery rate being one that still allows the listener to understand the information being conveyed. For example, an advertisement created in this way may have a fast speaking rate, or a fast information delivery rate, so as to convey as much information as possible in a given time slot. A listener using an embodiment of the present invention would be able to reduce the TSM rate for those segments of the work in which the speaking rate is too fast for the listener's comprehension or perception.
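The per-segment arithmetic mean described above can be sketched in Python as follows. Each speed contour is assumed to be a mapping from segment location to TSM rate, and a segment missing from some session is averaged over the sessions that do contain it; both the data layout and the name `average_contour` are illustrative assumptions.

```python
from statistics import fmean


def average_contour(contours):
    """Average several speed contours segment by segment.

    contours: list of dicts mapping segment location -> TSM rate.
    Returns a dict mapping each location to the mean rate over the
    contours that contain that location.
    """
    locations = sorted(set().union(*contours))
    return {
        loc: fmean(c[loc] for c in contours if loc in c)
        for loc in locations
    }


sessions = [
    {0: 1.0, 100: 2.0},   # first listening session
    {0: 2.0, 100: 2.0},   # second listening session
]
avg = average_contour(sessions)
```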

The term "democratic speed contour" refers to a speed contour obtained for a particular audio or audio-visual work by averaging several average speed contours, or several speed contours, obtained from different users listening to the work. The TSM rate or playback rate for a particular segment of the democratic speed contour is obtained by computing the arithmetic mean of the TSM rates or playback rates for the corresponding segment of the work in each of the several speed contours (for example, one from each different listener). How to store speed contours generated in accordance with embodiment 1000 for several users and/or several sessions of the same user, together with associated identifying information, and how to retrieve information relating to a particular one of the stored speed contours, should be clear to those of ordinary skill in the art.

One use of a democratic speed contour is by those who deliver information. To determine the optimal information delivery rate for a particular demographic group of listeners, a democratic speed contour can be generated from speed contours produced by members of that demographic group. For example, the embodiment can be used to provide a democratic speed contour that exploits the fact that listeners in one region of a country require a slower information delivery rate when listening to a speaker with an accent from another region of the country. In another use of a democratic speed contour, information about a particular demographic group of listeners is obtained, for example by questionnaire, and a target audience is then selected on the basis of the responses. For example, a group may be divided into a subgroup of listeners who use a PC and a subgroup of listeners who do not. The optimal information delivery rate for material about a computer software product is then obtained from the democratic speed contour generated by each subgroup. In this manner, the optimal information delivery rate of an advertising or informational audio or audio-visual work can be obtained for a particular demographic group of listeners.
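Grouping listeners before averaging, as in the PC-user example above, can be sketched as follows. The sketch assumes each session arrives as a (group label, contour) pair, where the label comes from something like a questionnaire response; the labels, the data layout, and the name `democratic_contours` are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def democratic_contours(sessions):
    """Compute one averaged speed contour per listener group.

    sessions: iterable of (group_label, contour) pairs, where contour
    maps segment location -> TSM rate for one listener.
    Returns {group_label: {location: mean_rate}}.
    """
    rates_by_group = defaultdict(lambda: defaultdict(list))
    for group, contour in sessions:
        for loc, rate in contour.items():
            rates_by_group[group][loc].append(rate)
    return {
        group: {loc: fmean(rates) for loc, rates in sorted(per_loc.items())}
        for group, per_loc in rates_by_group.items()
    }


sessions = [
    ("pc", {0: 1.0, 100: 2.0}),
    ("pc", {0: 2.0, 100: 2.0}),
    ("no_pc", {0: 1.0}),
]
by_group = democratic_contours(sessions)
```

Comparing the per-group results for the same segment is one way to see where one audience needs a slower delivery rate than another.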

FIG. 5 shows a block diagram of a second embodiment 2000 of the present invention, which generates a speed contour for an audio or audio-visual work using user input together with a word map of the work. In such an embodiment, a speed contour can be generated without the user listening to the audio work or to the audio portion of the audio-visual work. In accordance with the second embodiment, the speed contour is obtained not by sampling TSM rates or playback rates, as described above in connection with the first embodiment, but by using an editor that displays and manipulates the speed contour in response to user input.

As shown in FIG. 5, embodiment 2000 comprises user interface 2100 (UI 2100), which receives input from a user. Many devices for receiving input from a user are well known to those of ordinary skill in the art. For example, it is well known that commercially available equipment exists that detects (a) key presses, (b) actuation of switches on a mouse, (c) movement of a slider or position indicator, and (d) spoken user commands, and that, in response, sends digital information representing the key press, switch actuation, slider or position indicator movement, or spoken command to a central processing unit.

As shown in FIG. 5, embodiment 2000 includes a user input processor 2200 (UIP) that receives user input from UI 2100, or data or signals from an input audio or audio-visual work stored on digital storage device 2075. In response, UIP 2200 generates and outputs data for producing a two-dimensional graph having, for example, (a) time, and possibly text or phonetic words, displayed on the horizontal axis and (b) TSM rate displayed on the vertical axis. Graphical display 2300 receives as input, from UIP 2200, data that provides a graph screen display image. In response, graphical display 2300 displays a two-dimensional representation of the input audio or audio-visual work together with text or phonetic labels. As is well known to those of ordinary skill in the art, text and/or phonetic information can be displayed, for example, as an overlay on top of a graphical representation of a sound waveform on a computer screen. Next, in accordance with embodiment 2000, the user highlights regions of the text displayed on graphical display 2300, for example with a cursor under the control of UI 2100, to identify the specific portions of the input audio or audio-visual work associated with the highlighted text. Then, using UI 2100 in a manner well known to those of ordinary skill in the art, the user selects and/or specifies a TSM rate or playback speed for the specific portions of the input work associated with the highlighted text. In another example of the second embodiment of the present invention, UIP 2200 includes a text editor that displays a transcription of the audio work, or of the audio portion of the audio-visual work. Using UI 2100 in a manner well known to those of ordinary skill in the art, the user selects regions of text and selects and/or specifies a TSM rate or playback speed for the selected regions. The samples or segments of the input work that correspond to the boundaries of the selected text regions are then determined and used to construct a speed contour. FIG. 6 shows, in pictorial form, a two-dimensional graph that displays a sound waveform and the corresponding text of the audio or audio-visual work. As shown in FIG. 6, the user has highlighted region 6100 of the input work, which contains a telephone number. The user then used slider bar 6200 to indicate the desired TSM rate for the selected region. Finally, FIG. 6 shows speed contour 6300, which is generated from the TSM rate requested by the user. FIG. 7 shows a transcribed (text) display of the audio or audio-visual work. As shown in FIG. 7, the user has highlighted transcribed region 7100 of the input work, which contains a telephone number.
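The step of turning selected text regions into a speed contour can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each transcribed word carries the sample range it occupies and that the contour is a list of (start sample, end sample, rate) spans; the names `Word` and `build_speed_contour` and the sample numbers are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start_sample: int  # first sample of the word in the work
    end_sample: int    # one past the last sample of the word


def build_speed_contour(words, selections, total_samples, default_rate=1.0):
    """Build a contour of (start_sample, end_sample, rate) spans.

    `selections` is a list of non-overlapping (first_word_index,
    last_word_index, rate) tuples, one per highlighted text region.
    Samples outside every selection keep `default_rate`.
    """
    contour = []
    cursor = 0
    for first, last, rate in sorted(selections):
        start = words[first].start_sample
        end = words[last].end_sample
        if start > cursor:
            # Unselected material before this region plays at the default rate.
            contour.append((cursor, start, default_rate))
        contour.append((start, end, rate))
        cursor = end
    if cursor < total_samples:
        contour.append((cursor, total_samples, default_rate))
    return contour


# Hypothetical transcript: the two middle words form a telephone number
# that the user highlights and slows to half speed.
words = [
    Word("call", 0, 8000),
    Word("555", 8000, 16000),
    Word("0100", 16000, 26000),
    Word("today", 26000, 32000),
]
contour = build_speed_contour(words, [(1, 2, 0.5)], total_samples=32000)
```

Here the highlighted region corresponds to samples 8000 through 26000, so the contour holds three spans, with the middle one at rate 0.5.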

UIP 2200 builds the speed contour in the same manner described above for speed contour 500 (in conjunction with FIG. 2). Finally, UIP 2200 stores the speed contour, for example on digital storage device 2075 or on another storage medium, or transmits the speed contour to another system through transmission equipment such as a modem.

Although FIG. 5 shows embodiment 2000 as comprising separate modules, in a preferred embodiment UI 2100 and UIP 2200 are implemented as software programs or modules that run on a general-purpose computer such as a personal computer. Digital storage device 2075 is implemented as a disk drive or RAM. In light of the foregoing detailed description, how to implement these programs or modules in software will be apparent to those of ordinary skill in the art. In addition, according to any of the many methods well known to those of ordinary skill in the art, the audio or audio-visual work may be stored in digital storage device 2075 in analog form and converted to digital form.

According to the second embodiment of the present invention described above, a speed contour is temporal in nature; that is, a TSM rate or playback speed is associated with each time interval of the audio or audio-visual work. Because of this temporal property, a listener or editor must preview the work in some fashion in order to determine a speed contour for it. To eliminate this requirement, a third embodiment of the present invention generates a Conceptual Speed Association data structure (CSA data structure) for use in producing LIF works. A CSA data structure is, for example, a list in which each entry pairs a list of concept identifiers with a speed value identifier. The CSA data structure is stored as a list of these sub-list pairs.

Concept identifiers include keywords, word strings, or phrases that express a concept, such as "stock market," "Wall Street," and "financial." Each concept identifier is paired with a speed value identifier that indicates the TSM rate or playback speed the user desires while listening to portions of an audio or audio-visual work that contain the concept identifier.

The third embodiment of the present invention utilizes a detection apparatus, which detects concept information in a particular portion of an audio or audio-visual work, and a retrieval apparatus, which uses the concept information to retrieve TSM rate or playback speed information from the CSA data structure so that the retrieved information can be used to determine the TSM rate or playback speed for that portion. In one embodiment of the present invention, the detection apparatus comprises speech recognition equipment well known to those of ordinary skill in the art. In another embodiment, the detection apparatus comprises apparatus that detects concept information contained in closed captioning information, such as that used in many television broadcasts or on videotapes. Detection apparatus for such closed captioning information is well known to those of ordinary skill in the art.

FIG. 8 is a block diagram of a third embodiment 4000 of the present invention, which generates a CSA data structure for an audio or audio-visual work. As shown in FIG. 8, embodiment 4000 includes a user interface 4100 (UI) that receives input from a user. An embodiment of UI 4100 is the same as UI 100 described above with respect to FIG. 1. UI 4100 provides output signals that represent the input from the user. The user input is interpreted by user input processor/playback control 4200 (UIP/PC) to indicate one of the following options selected by the user: (a) select a file corresponding to a particular audio or audio-visual work to be played (the selected file may be input directly to embodiment 4000, or may be a file previously stored by embodiment 4000); (b) start playback of the selected file; (c) halt playback of the selected file; (d) pause playback of the selected file; (e) change the TSM rate or playback speed of a portion of the audio or audio-visual work being played; or (f) specify the parameters Refine_or_Average, Theta, and Sigma, which the apparatus uses, in the manner described in detail below, to generate the CSA data structure.

UIP/PC 4200 receives input from UI 4100 and (a) converts the user input to numeric values; (b) interprets the user input to set parameter values and to control the generation, use, modification, or overriding of the CSA data structure; and (c) directs the accessing and loading of a data stream from the audio or audio-visual work by sending stream data requests to digital storage device 4075 (that is, it performs playback control). In the case of digital storage device 4075, UIP/PC 4200 may request access to a digital data file, representing the audio or audio-visual work, that is stored in a file system on the device. To direct the accessing and loading of the data stream, UIP/PC 4200 interprets the user input together with the location of the digital samples representing the work on digital storage device 4075, and computes the playback position within the selected file as a particular sample.

Digital storage device 4075 receives as input (a) stream data requests from UIP/PC 4200; optionally, (b) time-scale modified output from TSM subsystem 4300; and optionally, (c) a data stream representing a CSA data structure from CSA data structure generator 4500 (CSADS generator). Digital storage device 4075 produces as output (a) a data stream representing the audio or audio-visual work and (b) a stream of location information, for example position in a file, for the data stream being output. Methods of storing and retrieving general-purpose data using digital storage devices such as hard disk drives are well known to those of ordinary skill in the art.

The audio or audio-visual work is typically stored in digital form on digital storage device 4075. An embodiment of digital storage device 4075 is the same as digital storage device 75 described above with respect to FIG. 1. Digital storage device 4075 receives data requests from UIP/PC 4200, in accordance with methods well known to those of ordinary skill in the art, and provides a stream of digital samples representing the audio and/or audio-visual work. In an alternative embodiment, the audio or audio-visual work is stored in analog form on an analog storage device. In such an alternative embodiment, a stream of analog signals is input to apparatus (not shown) that converts the analog signal into digital samples. Many commercially available devices, well known to those of ordinary skill in the art, receive an input analog signal such as a voice signal and sample it at a rate of at least the Nyquist rate to provide a stream of digital samples that can be converted back into the analog signal without loss of fidelity. The digital samples are then sent to TSM subsystem 4300.

TSM subsystem 4300 receives as input (a) a stream of samples representing portions of the audio or audio-visual work from digital storage device 4075; (b) stream location information from digital storage device 4075, for example a sample count or time value, used to identify the position in the data stream of the samples being sent; and (c) the desired TSM rate or playback speed from TSM concept monitor 4400. The output of TSM subsystem 4300 is applied as input to (a) digital-to-analog converter/audio and/or audio-visual playback device 4600 (DA/APD) and, optionally, (b) digital storage device 4075, if desired, for storage of the time-scale modified output, that is, the LIF work. DA/APD 4600 is apparatus, well known to those of ordinary skill in the art, that receives digital samples and reproduces the audio or audio-visual work. In accordance with the present invention, the output of TSM subsystem 4300 is a stream of digital samples representing the audio or audio-visual work, with its playback rate supplied by TSM concept monitor 4400; this gives the user feedback on the user's current TSM rate specification. The user can listen to the time-scale modified output and change the TSM rate or playback speed by providing further input through UI 4100. In addition, if the user wishes to speed up or slow down a portion of the work that has just been played (or other portions, not yet played, that have the same concept identifier), the user can provide input through UI 4100 to rewind the work to the desired portion and replay it at the changed TSM rate or playback speed (or to specify the desired TSM rate or playback speed for the other portions). In this manner, the user determines the desired TSM rate or playback speed for each portion of the audio or audio-visual work. Embodiments of TSM subsystem 4300 and DA/APD 4600 are the same as TSM subsystem 300 and DA/APD 600 described above with respect to FIG. 1. As those of ordinary skill in the art will understand, whenever embodiment 4000 plays back an audio-visual work, TSM subsystem 4300 speeds up or slows down the visual information to match the audio of the audio-visual work. To do this in a preferred embodiment, the video signal is "frame-subsampled" or "frame-replicated", according to any of the many methods well known to those of ordinary skill in the art, to maintain synchronization between the audio and visual portions of the audio-visual work. Thus, if the audio is sped up and samples are requested at a faster rate, the frame stream is subsampled; that is, frames are skipped.
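Frame subsampling and frame replication can be sketched with a single index mapping. The Python below is a minimal illustration, assuming a constant playback rate over the clip and nearest-earlier-frame selection; the function name `resample_frames` is hypothetical, and practical systems may choose frames differently.

```python
def resample_frames(num_frames, rate):
    """Return the source frame index to show at each output frame slot.

    rate > 1.0 skips frames (frame subsampling);
    rate < 1.0 repeats frames (frame replication).
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    # The clip lasts num_frames / rate output slots at the new speed.
    out_count = int(num_frames / rate)
    # Each output slot shows the source frame that would be on screen
    # at the corresponding moment of the original timeline.
    return [min(int(i * rate), num_frames - 1) for i in range(out_count)]


fast = resample_frames(6, 2.0)   # every other frame is skipped
slow = resample_frames(3, 0.5)   # every frame is shown twice
```

With a rate of 2.0 the six source frames reduce to frames 0, 2, and 4; with a rate of 0.5 the three source frames are each shown twice.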

Concept determiner 4700 accepts as input different sets of data depending on the option in use. Under option 1, the input data comprises a data stream representing text or concepts, such as closed-caption data or textual annotations, stored with the current segment of the input audio or audio-visual work being supplied to TSM subsystem 4300. In option 1, concept determiner 4700 passes the incoming data stream representing text or concepts through as output to concept information decoder 4800. Under option 2, the input data comprises (a) a stream of samples representing portions of the audio or audio-visual work from digital storage device 4075 and (b) current stream location information from digital storage device 4075, for example a sample count or a time value for the beginning of the group of samples being sent, used to identify the position of those samples in the stream. In option 2, concept determiner 4700 provides as output a data stream representing the concepts contained in the current portion of the audio or audio-visual work being supplied to TSM subsystem 4300. The concepts and/or the transcription of spoken passages are determined by extracting closed-captioning information from the audio or audio-visual work, or by using speech recognition algorithms to obtain a stream of text from the input work. Many methods of extracting closed-captioning information, and of extracting text using speech recognition algorithms, are well known to those of ordinary skill in the art.

Concept information decoder 4800 accepts as input a data stream representing concept information from concept determiner 4700. In accordance with the present invention, the concept information includes, without limitation, transcribed passages, actual text, keywords, phrases, or other representations of concept information well known to those of ordinary skill in the art. In response, concept information decoder 4800 generates as output a data stream representing keywords and concepts for the current portion of the input audio or audio-visual work being sent to TSM subsystem 4300.

Concept information decoder 4800 processes its input to form a conceptual data representation of the input data stream. For example, concept information decoder 4800 may simply remove adjectives and articles from input representing a transcribed passage to provide output consisting only of nouns and noun phrases. Alternatively, concept information decoder 4800 may employ natural language processing to extract conceptual content from a stream of spoken words. Many methods of implementing concept information decoder 4800 are well known to those of ordinary skill in the art. For example, many systems use a technique known as clustering, which develops a data set of multidimensional vectors in which each element of a vector represents a particular property or value associated with attributes of the overall data set. Clustering allows concepts to be classified and grouped on the basis of the N-dimensional Euclidean distance between vectors. It often happens that an object in a clustered data set does not clearly belong to any single cluster; in that case, the object can be associated with more than one cluster, and Euclidean distance can be used to express the probability that the object is a member of each possible cluster. See, for example, the Ph.D. dissertation "Semantic Feature Extraction from Technical Texts with Limited Human Intervention" by Rajeev Agarwal, submitted to Mississippi State University in 1995.
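One simple way to turn Euclidean distances into membership probabilities is inverse-distance weighting, sketched below in Python. The specification does not say which mapping is used, so the weighting rule, the cluster names, and the two-dimensional centroids here are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """N-dimensional Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def membership(obj, centroids, eps=1e-9):
    """Map distances to cluster centroids onto probabilities that sum to 1.

    Closer centroids receive proportionally larger probability
    (inverse-distance weighting; `eps` avoids division by zero).
    """
    weights = {
        name: 1.0 / (euclidean(obj, centre) + eps)
        for name, centre in centroids.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical concept clusters in a two-dimensional feature space.
centroids = {"finance": (0.0, 0.0), "film": (4.0, 0.0)}
probs = membership((1.0, 0.0), centroids)
```

The object lies at distance 1 from the "finance" centroid and distance 3 from the "film" centroid, so under this rule it is assigned to "finance" with probability of about 0.75 and to "film" with about 0.25.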

TSM concept monitor 4400 guides embodiment 4000 in generating a CSA data structure by receiving as input (a) user input translated by UIP/PC 4200 into a desired TSM rate or playback speed (the desired TSM rate or playback speed may represent a change in the TSM rate or playback speed for the portion of the audio or audio-visual work being perceived); (b) data from concept information decoder 4800 representing the concepts for the portion of the input audio or audio-visual work currently being sent to TSM subsystem 4300; and (c) the parameter Speed_Change_Resolution from UIP/PC 4200.

TSM concept monitor 4400 processes the TSM rate or playback speed requested by the user, together with the concept information, and derives a single TSM rate for the concept supplied as its input. For example, the concept output by concept information decoder 4800 may remain unchanged for several seconds, because an input concept such as "financial markets" may represent several words or phrases of the audio or audio-visual work being played. As a result, the user may request multiple TSM rates over the interval of the input work that is associated with a single concept. In accordance with the present invention, TSM concept monitor 4400 produces a single TSM rate for a concept by, for example, taking the arithmetic average of the TSM rates over the interval of the input work associated with that concept. Alternatively, a weighted average may be used that emphasizes the most recent TSM rates obtained during the period in which a particular concept is present at the input to TSM concept monitor 4400. These are merely examples of the many different methods that can be used.
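The two averaging schemes named above can be sketched in a few lines of Python. The geometric decay used for the weighted average is an assumption chosen for illustration, since the specification only requires that recent rates be emphasized; the function names and sample rates are hypothetical.

```python
def arithmetic_mean(rates):
    """Plain average of the TSM rates requested while one concept was active."""
    return sum(rates) / len(rates)


def recency_weighted_mean(rates, decay=0.5):
    """Weighted average that emphasizes the most recent rates.

    The newest rate has weight 1, the one before it `decay`,
    the one before that `decay ** 2`, and so on (assumed scheme).
    """
    n = len(rates)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * r for w, r in zip(weights, rates)) / sum(weights)


# Rates requested, oldest first, while the concept stayed unchanged.
requested = [1.0, 1.0, 2.0]
plain = arithmetic_mean(requested)
weighted = recency_weighted_mean(requested)
```

Because the listener's latest request was the fastest, the recency-weighted result is pulled closer to 2.0 than the plain average is.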

TSM concept monitor 4400 uses the parameter Speed_Change_Resolution to determine the appropriate TSM rate to pass to TSM subsystem 4300 and CSADS generator 4500. The TSM rate determined for a particular concept is converted to one of a set of quantized levels by methods well known to those of ordinary skill in the art. This means that the output TSM rate or playback speed can change only when the desired input TSM rate changes by an amount that exceeds the difference between quantized levels, that is, Speed_Change_Resolution, and that the number of possible TSM rates is limited for efficient representation in the data structure. The parameter Speed_Change_Resolution may be set as a predetermined parameter for embodiment 4000 in accordance with methods well known to those of ordinary skill in the art, or it may be entered and/or changed by receiving user input through UI 4100 in accordance with such methods. However, the manner in which this parameter is set and/or changed is not shown, for ease of understanding the present invention.
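The quantization step can be illustrated with the following Python sketch. It assumes nearest-level rounding to multiples of Speed_Change_Resolution, which is only one reading of the description; the class name `RateQuantizer` and the resolution value are hypothetical.

```python
class RateQuantizer:
    """Snap desired TSM rates to multiples of a fixed resolution."""

    def __init__(self, resolution):
        self.resolution = resolution  # plays the role of Speed_Change_Resolution
        self.level = None             # integer index of the current level

    def update(self, desired_rate):
        """Return (quantized_rate, changed) for a newly desired rate.

        `changed` is True only when the desired rate lands on a different
        quantized level than the previous call did.
        """
        new_level = round(desired_rate / self.resolution)
        changed = new_level != self.level
        self.level = new_level
        return new_level * self.resolution, changed


quantizer = RateQuantizer(0.25)
first = quantizer.update(1.0)    # first call always reports a change
second = quantizer.update(1.1)   # still nearest to level 1.0
third = quantizer.update(1.4)    # nearest level is now 1.5
```

Storing the level as an integer keeps the comparison exact and bounds the number of distinct rates that can appear in the data structure.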

TSM concept monitor 4400 produces as output (a) a single TSM rate value and (b) concept information. The TSM rate is applied as input to TSM subsystem 4300 and to CSA data structure generator 4500 (CSADS generator), and the concept information is applied as input to CSADS generator 4500. It should be clear to those of ordinary skill in the art that the description above presents, solely for ease of understanding the present invention, an embodiment that uses averaging to determine a single TSM rate for a concept. The present invention is not limited to any one algorithm for determining a TSM rate and associating it with a concept, nor are embodiments of the present invention limited to associating a single TSM rate with a concept. For example, the TSM rate associated with a concept may vary; it may increase during playback to reflect the fact that the user has become very familiar with the concept and, as the concept recurs during playback of the work, needs less time to understand the information.

CSADS generator 4500 accepts as input (a) concept information from TSM concept monitor 4400; (b) the TSM rate or playback speed for the concept, also from TSM concept monitor 4400; and (c) values of the parameters Refine_or_Average, Theta, and Sigma from UIP/PC 4200, which are used to control the process of creating the CSA data structure. Many methods of implementing such a data structure are well known to those of ordinary skill in the art.

For example, a CSA data structure may be implemented as a series of groups of related keywords, phrases, or concepts, each accompanied by an appropriate TSM value:

(("stock", "bonds", "stock market", "wall street", "currency") 0.8)

(("Hollywood", "actor", "movie") 1.5)

Here, the TSM rate for the first concept group is 0.8 and the TSM rate for the second concept group is 1.5. This data structure expresses a listener's desire to hear information about the stock market and other financial concepts at a reduced playback speed (0.8 times normal speed) and to have information about Hollywood movies presented at an accelerated playback speed (1.5 times normal speed).
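The example structure maps naturally onto a list of (concept identifiers, rate) pairs. The Python sketch below shows one assumed way to consult it for a passage of text; the case-insensitive substring match, the default rate of 1.0, and the function name `rate_for_text` are illustrative choices rather than part of the specification.

```python
# The CSA data structure from the example: each entry pairs a tuple of
# concept identifiers with the TSM rate desired for that concept group.
CSA = [
    (("stock", "bonds", "stock market", "wall street", "currency"), 0.8),
    (("Hollywood", "actor", "movie"), 1.5),
]


def rate_for_text(text, csa, default_rate=1.0):
    """Return the rate of the first concept group mentioned in `text`.

    Matching is a naive case-insensitive substring test, so "actor" would
    also match "factor"; a real detector would match on word boundaries.
    """
    lowered = text.lower()
    for concepts, rate in csa:
        if any(concept.lower() in lowered for concept in concepts):
            return rate
    return default_rate


finance_rate = rate_for_text("The stock market closed higher", CSA)
movie_rate = rate_for_text("A new movie opens Friday", CSA)
other_rate = rate_for_text("Rain is expected tomorrow", CSA)
```

Passages that mention none of the listed concepts fall back to the assumed default rate.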

CSADS generator 4500 uses a database or scratch-pad memory to maintain a list of records, each of which stores information corresponding to a concept and the TSM rate associated with that concept. FIG. 9 shows a flowchart of an algorithm used in one embodiment of CSADS generator 4500 to generate a CSA data structure.

As shown in FIG. 9, concept information and a TSM rate are applied as input to box 9500. At box 9500, a search is performed to locate any records in the database that contain the same concept information. Control then transfers to box 9510. At box 9510, if concepts were found at box 9500, a numerical value is determined that reflects the similarity of the list of potential matches to the concept. The conceptual distance between two words, or between data representing concepts, can be computed using any of a number of methods well known to those of ordinary skill in the art. For example, in the simplest form, a list of synonyms or other reference data may be employed to compute the distance. In another method, Euclidean distance can be used to measure the similarity of multidimensional vector objects in a data set that employs a clustering algorithm to classify concepts. In yet another method, a "head-driven phrase structure grammar" is commonly used to parse the meaning of sentences and phrases. Control then transfers to box 9520.

At box 9520, a determination is made whether the closest matching record lies within a given amount, the parameter Theta. If the closest match is within Theta, control transfers to box 9530; otherwise, control transfers to box 9590.

At box 9530, a determination is made whether the parameter Refine_or_Average equals "Refine" or "Average". If Refine_or_Average is "Refine", control transfers to box 9540. If Refine_or_Average is "Average", control transfers to box 9580.

At box 9580, the TSM value stored for the particular concept is updated by computing the arithmetic mean of the TSM value already in the CSA data structure and the current TSM rate. Control then transfers to box 9570.

At box 9590, a new record is created in the database. Control then transfers to box 9600. At box 9600, the values of the CSA data structure are installed as follows: (a) the current concept is stored in the concept field, and (b) the current TSM rate is stored in the TSM field. Control then transfers to box 9570.

At box 9540, the TSM rate of the closest matching record is compared with the current TSM rate. If the difference is greater than the parameter Sigma, control transfers to box 9560; otherwise, control transfers to box 9570. At box 9560, the current concept or keyword phrase is narrowed by appending the previous concept to the current concept, so that the concept becomes more specific and can be distinguished from the existing concept in the CSA data structure. For example, in a preferred embodiment of the present invention, the concept or keyword "bond" may be contained in a CSA data structure record corresponding to financial information; that is, the concept field for financial information may contain ("money", "stock", "bond"). If the input audio or audio-visual work contained the phrase "actor James Bond", and the listener consistently sped up playback during this phrase so that the TSM rate differed by more than Sigma from the value in the TSM field corresponding to the financial information concept field, the concept or keyword "bond" would be prefixed with the previous concept or keyword, in this case "James". Control then transfers, as shown, to box 9500, and the database is searched again using the new concept "James Bond". According to this embodiment of the present invention, a separate entry is generated for the keyword "bond": one entry corresponds to its use in the context of financial reports, and the other corresponds to its use when paired with the name "James". At box 9570, the newly created or updated record is stored in the database.
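The flow of FIG. 9 described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `Record` type, the `exact_distance` function (0 for a keyword match, 1 otherwise), and the rule that a concept is narrowed at most once per call are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Record:
    keywords: Tuple[str, ...]  # concept field (hypothetical layout)
    tsm_rate: float            # TSM field


def exact_distance(record: Record, concept: str) -> float:
    """Toy concept distance (assumption): 0 if the concept is a keyword, else 1."""
    return 0.0 if concept in record.keywords else 1.0


def update_csa(records: List[Record], concept: str, tsm_rate: float,
               previous_concept: Optional[str],
               distance: Callable[[Record, str], float],
               theta: float, sigma: float, mode: str = "Refine") -> Record:
    while True:
        # Boxes 9500/9510: search the records and score each candidate.
        closest = min(records, key=lambda r: distance(r, concept), default=None)
        # Box 9520: is the closest match within Theta?
        if closest is None or distance(closest, concept) > theta:
            # Boxes 9590/9600/9570: create, fill in and store a new record.
            record = Record((concept,), tsm_rate)
            records.append(record)
            return record
        # Box 9530: Refine or Average?
        if mode == "Average":
            # Box 9580: arithmetic mean of stored and current TSM rate.
            closest.tsm_rate = (closest.tsm_rate + tsm_rate) / 2.0
            return closest
        # Box 9540: do the rates differ by more than Sigma?
        if abs(closest.tsm_rate - tsm_rate) <= sigma or previous_concept is None:
            return closest  # box 9570: keep the existing record
        # Box 9560: narrow the concept by prefixing the previous concept,
        # then search again (back to box 9500).
        concept = f"{previous_concept} {concept}"
        previous_concept = None  # assumption: narrow at most once per call


records = [Record(("money", "stock", "bond"), 0.8)]
new_record = update_csa(records, "bond", 2.0, "James",
                        exact_distance, theta=0.5, sigma=0.5)
```

With these toy settings, the financial record is left untouched and a second record keyed on "James bond" is added, mirroring the "James Bond" example in the text.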

In another embodiment of the present invention, a CSA data structure may be generated without using embodiment 4000 described above. Instead, a CSA data structure may be generated, for example, by entering data into the structure using a text editor, or by filling out a questionnaire about concepts of interest. Such a CSA data structure can be used to create an LIF work from an audio or audio-visual work that the user has not previously heard. In the same manner, a CSA data structure may be constructed using keywords or phrases of the kind typically supplied to "online" search engines, and may be used to control data retrieval.

A CSA data structure may also be used to control the playback speed of audio or audio-visual works retrieved by a search engine, thereby creating LIF works from previously unheard works that the search engine returns. In one such embodiment, the CSA data structure is obtained from the user-specified search criteria entered into the search engine. For example, user input to a search engine requesting "all boats excluding yachts" would create an LIF work that plays information about boats at normal speed, but speeds up or excludes items about yachts. In light of the detailed description, it will be apparent to those skilled in the art how to generate a CSA data structure using, for example, information transmitted from a search engine.

In yet another embodiment of the present invention, the CSA data structure may contain TSM rate entries such as "infinity" for particular concepts or keywords. In this embodiment of the invention, a TSM rate of "infinity" (or some other indicium that is interpreted in the same way) instructs the playback system to skip sections of the audio or audio-visual work whose concepts have a TSM rate of "infinity". According to this embodiment, a user can specify "no interest" in particular concepts or keywords when listening to or searching audio or audio-visual works. For example, a user may specify the following CSA data structure while listening to a nightly news broadcast.

(("weather", "partly cloudy", "weather forecast", "temperature", "dew point") "infinity")

(("stock", "bond", "stock market", "wall street", "currency") 0.8)

(("Hollywood", "actor", "movie") 1.5)

This CSA data structure instructs the playback system to: (a) skip weather forecasts and temperature reports during the broadcast, (b) play financial information at 0.8 times the normal playback speed, and (c) speed up information about Hollywood movies and actors by raising the TSM rate to 1.5 times the normal playback speed.
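As an illustration of how such a structure might drive playback, the sketch below encodes the three entries and looks up an action for a detected concept. The tuple layout, the `playback_action` helper, and the default rate of 1.0 for unlisted concepts are assumptions for the example; the patent does not prescribe a particular representation.

```python
import math
from typing import List, Optional, Tuple

# CSA data structure from the text; "infinity" is encoded as math.inf.
CSA: List[Tuple[Tuple[str, ...], float]] = [
    (("weather", "partly cloudy", "weather forecast", "temperature", "dew point"),
     math.inf),
    (("stock", "bond", "stock market", "wall street", "currency"), 0.8),
    (("Hollywood", "actor", "movie"), 1.5),
]


def playback_action(concept: str,
                    csa: List[Tuple[Tuple[str, ...], float]] = CSA,
                    default_rate: float = 1.0) -> Tuple[str, Optional[float]]:
    """Return ("skip", None) or ("play", rate) for a detected concept."""
    for keywords, rate in csa:
        if concept in keywords:
            if math.isinf(rate):
                return ("skip", None)   # an "infinity" entry means: skip the section
            return ("play", rate)
    # Assumption: concepts without an entry play at normal speed.
    return ("play", default_rate)
```

Here `playback_action("dew point")` yields a skip, while `playback_action("bond")` and `playback_action("movie")` yield playback at 0.8 and 1.5 times normal speed respectively.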

Embodiments of the present invention are not limited to static CSA data structures; a user may also provide input during playback to update the TSM rates. For example, suppose the CSA data structure contained the entry:

(("Hollywood", "actor", "movie") 1.5)

If the user consistently intervened to change the playback speed whenever the phrase "actor, James Bond" occurred in the input, embodiment 4000 of the present invention may change or update the CSA data structure by adding a new entry, so that the modified data structure reads as follows.

(("Hollywood", "actor", "movie") 1.5)

(("actor James Bond") 2.0)

In this manner, the CSA data structure can be continuously updated to reflect the user's interests while the user listens to new material and new concepts using the existing CSA data structure.

As can be readily appreciated, the use of a CSA data structure is not limited to TSM rates; as described above, the first derivative of the TSM rate may be used to the same effect. For example, if a user consistently slows down on hearing the words "free sample", a CSA data structure that stores changes in TSM rate, rather than the TSM rate itself, would be equally useful for controlling the playback speed of previously unheard material.

Although FIG. 8 shows embodiment 4000 as comprising separate modules, in a preferred embodiment UI 4100, UIP/PC 4200, TSM subsystem 4300, TSM concept monitor 4400, concept determiner 4700, concept information decoder 4800, and CSADS generator 4500 may be implemented as software programs or modules that run on a general purpose computer such as a PC. In addition, digital storage device 4075 may be implemented as a disk drive or RAM, and D/A converter 4600 may be implemented as a typical accessory of a general purpose computer, such as a sound card in a PC. In light of the detailed description, it will be apparent to those skilled in the art how to implement these programs or modules in software.

Embodiment 4000 shown in FIG. 8 may be modified to convert a speed contour previously generated for a particular audio or audio-visual work into a CSA data structure for that work. In such an embodiment, the TSM rate is obtained from the speed contour (replacing the user TSM rate value from UIP/PC 4200) and is provided as input to TSM concept monitor 4400. In light of the detailed description herein, it will be apparent to those skilled in the art how to input a speed contour and obtain TSM rates from it. Likewise, embodiment 1000 shown in FIG. 1 may be modified to convert a CSA data structure previously generated for a particular audio or audio-visual work into a speed contour for that work. In this variation, the TSM rate is obtained from the CSA data structure (replacing the user TSM rate output from UIP/PC 200) and is provided as input to TSM monitor 400. The TSM rate is obtained from the CSA data structure in accordance with embodiment 6000 (described in detail below); that is, the TSM rate is output from TSM concept look-up 6500 of embodiment 6000.

For ease of understanding, the embodiments described above refer to TSM rates. However, the present invention is not limited thereto. It will be understood that embodiments of the present invention may use anything from which a TSM rate can be determined (affinity information, as referred to above) in making or practicing the invention. For example, an indication of user interest or of a user information retrieval level may be used in place of the TSM rate. To provide playback, a conversion is then made between the user interest or user information retrieval level and the TSM rate. In such embodiments, a conversion function may be used to map the user interest or user information retrieval level to a TSM rate. In some such embodiments, the conversion function may be modified without changing, for example, the speed contour or the CSA data structure.
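One possible shape for such a conversion function is sketched below. The interest scale of 0.0 to 1.0, the linear mapping, and the rate bounds are all hypothetical choices for illustration; the text only requires that some function map interest to a TSM rate.

```python
def interest_to_tsm_rate(interest: float,
                         slowest: float = 0.8,
                         fastest: float = 2.0) -> float:
    """Map a user-interest level in [0, 1] to a TSM rate (assumed linear).

    High interest plays slowly (toward `slowest`); low interest plays
    quickly (toward `fastest`). All parameter values are illustrative.
    """
    if not 0.0 <= interest <= 1.0:
        raise ValueError("interest must lie between 0.0 and 1.0")
    return fastest - interest * (fastest - slowest)
```

Because the stored data holds interest levels rather than rates, swapping in a different mapping (or different bounds) changes playback behavior without touching the speed contour or CSA data structure.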

For ease of understanding, the embodiments above refer to a speed contour that provides a correspondence between TSM rates and associated temporal positions, and to a CSA data structure that provides a correspondence between TSM rates and associated concepts. However, the present invention is not limited thereto. It will be understood that embodiments of the present invention may refer to a speed contour or CSA data structure that provides a correspondence between anything from which a TSM rate can be determined and anything by which the one or more portions of the work associated with that TSM rate can be identified.

It will also be understood that, in embodiments of the present invention that refer to a speed contour or CSA data structure, the indicator of the TSM rate and the indicator of the portion may have a functional dependence that determines the TSM rate to be used for a particular portion indicator. For example, in an embodiment where concepts are used to identify portions of a work, the TSM rate associated with a particular concept may be computed as a function of the number of times the concept has appeared in the work, so that the first playback of the concept uses a slow TSM rate and subsequent occurrences of the same concept use increased TSM rates for faster playback.
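A minimal sketch of such an occurrence-dependent rate follows. The linear step-up and the cap are assumptions chosen for the example; the text only says that the rate is a function of how often the concept has appeared.

```python
from collections import Counter


class OccurrenceRate:
    """TSM rate that rises each time the same concept recurs (illustrative)."""

    def __init__(self, base_rate: float = 1.0, step: float = 0.25,
                 max_rate: float = 1.5) -> None:
        self.base_rate = base_rate   # rate for the first occurrence
        self.step = step             # increase per repeat (assumed linear)
        self.max_rate = max_rate     # upper bound on the rate
        self._seen: Counter = Counter()

    def rate_for(self, concept: str) -> float:
        """Record one more occurrence of `concept` and return its TSM rate."""
        self._seen[concept] += 1
        repeats = self._seen[concept] - 1
        return min(self.base_rate + self.step * repeats, self.max_rate)
```

Each concept keeps its own count, so a concept heard for the first time starts at the base rate regardless of how often other concepts have appeared.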

Application of Speed Contours and Concept Speed Association Data Structures to Generate Listener-Interest-Filtered Works

According to a fourth embodiment of the present invention, a speed contour is used in conjunction with an audio or audio-visual work to generate an LIF work, in which segments of the audio or audio-visual work are played back at the TSM rates or playback speeds specified by the speed contour. In addition, such embodiments store the LIF work for later playback by the same embodiment or by another playback device.

As those skilled in the art will readily appreciate, embodiments of the present invention that provide an LIF work for the audio portion of an audio-visual work may also speed up or slow down the visual information to match the audio. To do this in a preferred embodiment, the audio is processed using TSM methods known to those skilled in the art, as described above, and the video signal is "frame-subsampled" or "frame-replicated" to achieve the required TSM rate and to maintain synchronization between the audio and visual portions of the audio-visual work. Thus, if the audio is sped up and samples are requested at a faster rate, the frame stream is subsampled, that is, frames are skipped.
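Frame subsampling and frame replication can be illustrated with a simple index mapping. This is only a sketch under the assumption of a constant TSM rate over the frame sequence and nearest-earlier-frame selection; it is not taken from the patent.

```python
import math
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def resample_frames(frames: Sequence[T], tsm_rate: float) -> List[T]:
    """Skip frames (rate > 1) or repeat frames (rate < 1) to match the audio."""
    if tsm_rate <= 0:
        raise ValueError("tsm_rate must be positive")
    # Playing at `tsm_rate` shortens (or lengthens) the output by that factor.
    out_count = math.floor(len(frames) / tsm_rate)
    last = len(frames) - 1
    # Output frame i shows the input frame located at time i * tsm_rate.
    return [frames[min(last, int(i * tsm_rate))] for i in range(out_count)]
```

At a rate of 2.0 every other frame is dropped, and at a rate of 0.5 every frame is shown twice, so the picture stays aligned with the time-scale-modified audio.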

FIG. 10 shows a block diagram of embodiment 5000, an implementation of the fourth embodiment of the present invention, which uses a speed contour in conjunction with an audio or audio-visual work to generate an LIF work. As shown in FIG. 10, embodiment 5000 comprises a user interface (UI 5100) that receives input from a user. UI 5100 is the same as UI 100 described above with respect to FIG. 1. UI 5100 provides output signals that indicate the input from the user. The user input is interpreted by user input processor/playback controller 5200 (UIP/PC 5200) to indicate the following options selected by the user: (a) select a file to play, the file corresponding to a particular audio or audio-visual work (the selected file may be input directly to embodiment 5000 or may be a file stored by embodiment 5000); (b) select a speed contour to control the TSM rate or playback speed; (c) start playback of the selected file; (d) halt playback of the selected file; (e) pause playback of the selected file; (f) modify or override the TSM rate or playback speed obtained from the speed contour for a portion of the audio or audio-visual work being played; or (g) specify the parameters Offset and Override, which are used by the apparatus in the manner described below.

As shown in FIG. 10, UIP/PC 5200 receives user input from UI 5100 and: (a) converts the user input to numeric values; (b) interprets the user input to set parameter values and to control the use, modification, or overriding of the speed contour; (c) directs accessing and loading of a data stream from the audio or audio-visual work by sending stream data requests to digital storage device 5075 (to perform playback control); and (d) directs accessing and loading of a data stream from the speed contour by sending stream data requests to digital storage device 5075. In the case of digital storage device 5075, UIP/PC 5200 may request access to a digital data file, stored in a file system on the device, that represents the audio or audio-visual work. To direct accessing and loading of the data stream from the audio or audio-visual work, UIP/PC 5200 interprets the user input and the location of the digital samples representing the work stored on digital storage device 5075, and computes the playback position for the selected file at a particular sample. In a preferred embodiment, the data requests for the audio or audio-visual work and the data requests for the speed contour are issued such that data from the same temporal position in each is provided as output from digital storage device 5075.

Digital storage device 5075 receives the following as input: (a) stream data requests from UIP/PC 5200, and optionally (b) time-scale modified output from TSM subsystem 5300. Digital storage device 5075 produces the following as output: (a) a data stream representing the audio or audio-visual work; (b) a stream of location information, for example the position in the file of the data stream being output; and (c) a data stream representing the speed contour. Many methods of storing and retrieving general purpose data using digital storage devices, for example hard disk drives, are known to those skilled in the art.

The audio or audio-visual work is typically stored in digital form on digital storage device 5075. An embodiment of digital storage device 5075 is the same as digital storage device 75 described above with respect to FIG. 1. Digital storage device 5075 is accessed by UIP/PC 5200 in accordance with methods well known in the art to provide a stream of digital samples representing the audio and/or audio-visual work. In an alternative embodiment, the audio or audio-visual work is stored in analog form on an analog storage device. In such an alternative embodiment, a stream of analog signals is input to an apparatus (not shown) that converts the analog signal into digital samples. Many commercially available devices receive an input analog signal, such as a speech signal, and sample it at a rate of at least the Nyquist rate to provide a stream of digital signals that can be converted back into an analog signal without loss of fidelity. The digital samples are then sent to TSM subsystem 5300.

TSM rate determiner 5400 receives the following as input: (a) the speed contour selected by the user, applied as input from digital storage device 5075; (b) a TSM rate specified by the user, applied as input from UIP/PC 5200; (c) an offset TSM rate specified by the user, Offset, applied as input from UIP/PC 5200; (d) a Boolean parameter specified by the user, Override, applied as input from UIP/PC 5200; and (e) current stream location information from digital storage device 5075 that is used to identify the position in the stream of the samples being sent, for example a sample count or the start time of the group of samples transferred from digital storage device 5075. In response, TSM rate determiner 5400 produces as output a TSM rate, which is received by TSM subsystem 5300.

TSM rate determiner 5400 uses the stream location information to select the closest corresponding temporal position in the speed contour, in order to determine the associated TSM rate specified by the speed contour. This approach allows speed contours generated with different Interval_Size values or different TSM sampling frequencies to be applied to any audio or audio-visual work, and ensures a one-to-one temporal correspondence between positions in the data stream and the TSM rates obtained from the speed contour.
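The nearest-position lookup can be sketched as a binary search over the contour's time stamps. The representation of a speed contour as two parallel lists (`times` sorted ascending, `rates`) and the tie-breaking toward the earlier entry are assumptions made for this example.

```python
import bisect
from typing import Sequence


def contour_rate(times: Sequence[float], rates: Sequence[float],
                 position: float) -> float:
    """Return the TSM rate at the contour entry nearest to `position`.

    `times` must be sorted ascending and non-empty; `rates[i]` is the TSM
    rate recorded at `times[i]`. Ties go to the earlier entry (assumption).
    """
    i = bisect.bisect_left(times, position)
    if i == 0:
        return rates[0]          # before the first entry
    if i == len(times):
        return rates[-1]         # after the last entry
    before, after = times[i - 1], times[i]
    # Pick whichever neighbouring time stamp is closer to the position.
    return rates[i] if after - position < position - before else rates[i - 1]
```

Because the lookup works on time stamps rather than array indices, contours sampled at different intervals can be applied to the same work without resampling them first.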

TSM rate determiner 5400 determines the output TSM rate or playback speed using one of the following modes of operation.

1. Speed-contour-driven playback mode: In this mode, the output of TSM rate determiner 5400 is the TSM rate obtained from the speed contour for the corresponding portion of the input audio or audio-visual work to be played. This mode outputs exactly the TSM rate specified by the speed contour.

2. Speed contour offset playback mode: In this mode, the user specifies, via UI 5100, an offset parameter Offset that is used to adjust the TSM rate specified by the speed contour. In this mode, the output TSM rate is given by the following formula.

TSM_rate = (TSM rate from the speed contour) * (1 + Offset)

For example, if the user specifies an offset factor of -0.4, TSM rate determiner 5400 adds the offset value of -0.4 to 1.0 (giving 0.6) and scales each TSM rate specified in the speed contour by the result, thereby achieving a uniform reduction (slow-down) in the TSM rate or playback speed of the resulting output signal. Likewise, a positive offset increases (speeds up) the TSM rate or playback speed of the resulting output signal. An offset value of 0 has no effect on the TSM rate. As can be readily appreciated, different offset methods may be employed to achieve non-linear as well as linear scaling of the TSM rate.

3. User override of speed contour mode: In this mode, the user can override the speed contour and manually control the TSM rate or playback speed over portions of the audio or audio-visual work. When the user releases the override, the TSM rate used to determine the TSM rate or playback speed of the output signal is again taken from the corresponding position in the speed contour.
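The three modes above can be folded into one small function. The signature below is an assumption made for illustration: `contour_rate` stands for the rate already looked up in the speed contour, while `offset`, `override`, and `user_rate` correspond to the Offset parameter, the Boolean Override parameter, and the user-specified TSM rate named in the text.

```python
from typing import Optional


def determine_tsm_rate(contour_rate: float,
                       offset: float = 0.0,
                       override: bool = False,
                       user_rate: Optional[float] = None) -> float:
    """Choose the output TSM rate for the current stream position."""
    # Mode 3: while Override is set, the user's manual rate wins.
    if override and user_rate is not None:
        return user_rate
    # Mode 2: scale the contour rate by (1 + Offset).
    # Mode 1 is the special case Offset == 0 (rate passes through unchanged).
    return contour_rate * (1.0 + offset)
```

For instance, a contour rate of 1.5 with an Offset of -0.4 is scaled by 0.6, giving roughly 0.9; once Override is cleared, the function falls back to the contour-derived rate.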

As shown in FIG. 10, TSM subsystem 5300 receives the following as input: (a) a stream of samples from digital storage device 5075 representing portions of the audio or audio-visual work; (b) stream location information from digital storage device 5075, for example a sample count or time value, used to identify the position in the data stream of the samples being sent; and (c) the TSM rate from TSM rate determiner 5400. As described above, the input may be analog, in which case it is converted into a series of digital samples in accordance with methods and apparatus known to those skilled in the art. The output from TSM subsystem 5300 is applied as input to: (a) digital-to-analog converter/audio and/or audio-visual playback device 5600 (DA/APD 5600), and optionally (b) digital storage device 5075, to store the playback at the TSM rate if desired. DA/APD 5600 is an apparatus well known in the art that receives digital samples and provides playback of the audio or audio-visual work. The output from TSM subsystem 5300 is a stream of digital samples comprising a digitized audio or audio-visual stream that is a time-scale modified version of the input audio or audio-visual work and, in accordance with the present invention, reflects the TSM rates or playback speeds specified by the speed contour and/or by user input. This output represents the LIF work.

In some embodiments, the LIF work is stored for later playback by the same embodiment or by another playback device. In addition, the digital output can be converted to analog form for storage on an analog device. Many devices that receive a digitized input signal, such as 16-bit pulse code modulation, and output a corresponding analog signal are well known to those of ordinary skill in the art. For example, commercially available equipment receives a stream of digitized samples representing a signal and converts the samples to an analog signal without loss of fidelity. Embodiments of TSM subsystem 5300 and DA/APD 5600 are the same as TSM subsystem 300 and DA/APD 600 described above with respect to FIG. 1. As those of ordinary skill in the art will appreciate, whenever embodiment 5000 provides playback of an audio-visual work, TSM subsystem 5300 speeds up or slows down the visual information to match the audio of the audio-visual work. To do this in a preferred embodiment, the video signal is "frame-subsampled" or "frame-replicated" according to any one of the many methods known in the art, to maintain synchronization between the audio and visual portions of the audio-visual work. Thus, if the audio is sped up and samples are requested at a faster rate, the frame stream is subsampled, i.e., frames are skipped.
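The frame-subsampling and frame-replication idea described above can be illustrated with a minimal Python sketch. It assumes a fixed output frame rate and a single constant TSM rate, and it maps each output frame to the input frame that should be shown. The function name `retime_frames` and the "most recent input frame" rule are illustrative assumptions, not a method taken from this description.

```python
def retime_frames(num_frames, rate):
    """Return, for each output frame, the index of the input frame to display.

    num_frames: number of frames in the input segment.
    rate:       TSM rate (2.0 = twice normal speed, 0.5 = half normal speed).

    Assumption: the display runs at a fixed frame rate, so a segment played at
    `rate` occupies num_frames / rate output frames.
    """
    if rate <= 0:
        raise ValueError("TSM rate must be positive")
    out_count = int(num_frames / rate)
    # rate > 1 skips input frames (subsampling);
    # rate < 1 repeats input frames (replication).
    return [min(int(i * rate), num_frames - 1) for i in range(out_count)]


if __name__ == "__main__":
    print(retime_frames(6, 2.0))  # every other frame is skipped
    print(retime_frames(3, 0.5))  # every frame is shown twice
```

A practical player would vary the rate from segment to segment and would typically track audio timestamps rather than frame counts; the sketch only shows the index mapping.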

Although FIG. 10 shows embodiment 5000 as comprised of separate modules, in a preferred embodiment UI 5100, UIP/PC 5200, TSM subsystem 5300, and TSM rate determiner 5400 are implemented as software programs or modules that run on a general-purpose computer such as a personal computer. In addition, digital storage device 5075 is implemented as a disk drive or random access memory, and D/A converter 5600 is implemented as a typical accessory of a general-purpose computer, such as a sound card in a personal computer. In light of the detailed description, how to implement these programs or modules in software will be apparent to those of ordinary skill in the art.

As can readily be appreciated, in the absence of user input, the time scale of the LIF work is determined entirely by the speed contour. The data fetch rate for the input signal is likewise determined by the speed contour: speeding up requires a higher fetch rate, and slowing down requires a lower one. Because the speed contour has a temporal correspondence with the input signal, the data fetch rate, or read rate, for the speed contour is the same as that for the input signal. In many embodiments it is desirable to reduce the number of devices whose read rates fluctuate. In accordance with the present invention, the fluctuation in the read rate can be eliminated in the following manner.

The data contained in the speed contour is read at the rate specified by the previous values of the speed contour. A new speed contour is obtained by performing time-scale modification on the input speed contour, using the speed contour itself to supply the TSM rates. The time-scale-modified speed contour shares a temporal correspondence with the output signal produced by applying the original speed contour to the input signal. Because output is generated at a fixed rate regardless of the TSM performed, the values of the time-scale-modified speed contour are accessed at a fixed rate.
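The re-timing of a speed contour by itself can be sketched in a few lines of Python. The sketch assumes the contour is a list with one TSM rate per fixed interval of input time and produces a list with one rate per equal interval of output time, so a player can step through it at a fixed rate. The list representation, the name `warp_contour`, and the `step` parameter are assumptions made for illustration.

```python
def warp_contour(contour, step=1.0):
    """Re-time a speed contour so that it can be read at a fixed rate.

    contour: TSM rates, one per `step` seconds of INPUT time.
    Returns: TSM rates, one per `step` seconds of OUTPUT time.
    """
    warped = []
    out_time = 0.0   # output time covered by the input intervals consumed so far
    next_tick = 0.0  # next output instant at which a rate value is needed
    for rate in contour:
        if rate <= 0:
            raise ValueError("TSM rates must be positive")
        # An input interval of `step` seconds played at `rate` lasts step / rate.
        out_time += step / rate
        # Emit this rate for every output tick that falls inside the interval.
        while next_tick < out_time - 1e-9:
            warped.append(rate)
            next_tick += step
    return warped


if __name__ == "__main__":
    # Two input seconds at double speed fill one output second;
    # one input second at half speed fills two output seconds.
    print(warp_contour([2.0, 2.0, 0.5]))
```

Intervals played faster than normal are compressed into fewer output ticks and intervals played slower are stretched over more, which mirrors what TSM does to the audio itself.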

In accordance with a fifth embodiment of the present invention, a CSA data structure is used in conjunction with an audio or audio-visual work to produce an LIF work in which portions of the audio or audio-visual work are played back at the TSM rates, or playback speeds, specified by the CSA data structure. In addition, some such embodiments store the LIF work for later playback by the same embodiment or by another playback device.

FIG. 11 shows a block diagram of the fifth embodiment of the present invention, which uses a CSA data structure in conjunction with an audio or audio-visual work to produce an LIF work. As shown in FIG. 11, embodiment 6000 includes user interface 6100 (UI 6100), which receives input from a user. Embodiments of UI 6100 are the same as UI 100 described above with respect to FIG. 1. UI 6100 provides output signals that represent the user's input. The user input is interpreted by user input processor/playback controller 6200 (UIP/PC 6200) to indicate the following options selected by the user: (a) select a file to play, the file corresponding to a particular audio or audio-visual work (the selected file may be input directly to embodiment 6000 or may be a file stored by embodiment 6000); (b) select a CSA data structure to control the TSM rate, or playback speed; (c) start playback of the selected file; (d) halt playback of the selected file; (e) pause playback of the selected file; (f) modify or override the TSM rate, or playback speed, obtained from the CSA data structure for a portion of the audio or audio-visual work being played; or (g) specify the parameters Theta, Offset, Slew_Limit, and Override used by the methods and apparatus described below. An embodiment may also receive an audio or audio-visual work that is input directly, for example, from a television. In that case, the audio portion is converted to digital form in the manner described above for analog input, and any closed-caption information can likewise be converted to an appropriate digital form according to any one of the many methods known in the art.

As shown in FIG. 11, UIP/PC 6200 receives input from UI 6100 and (a) converts the user input to numeric values; (b) interprets the user input to set parameter values and to control the use, modification, or overriding of TSM rates from the CSA data structure; and (c) directs the accessing and loading of a data stream from the audio or audio-visual work by sending stream data requests to digital storage device 6075 (that is, it performs playback control). For digital storage device 6075, UIP/PC 6200 requests access to a file of digital data, representing the audio or audio-visual work, stored in a file system on the device. To direct the accessing and loading of the data stream, UIP/PC 6200 interprets the user input and the location of the digital samples representing the audio or audio-visual work stored on digital storage device 6075, and computes the playback position within the selected file as a particular sample.

Digital storage device 6075 receives the following as input: (a) stream data requests from UIP/PC 6200; and, optionally, (b) time-scale-modified output from TSM subsystem 6300. Digital storage device 6075 produces the following as output: (a) a stream of data representing the audio or audio-visual work; (b) a stream of location information, for example, the position in the file of the data stream being output; and (c) a stream of data representing the CSA data structure. Many methods of using digital storage devices, for example a hard disk drive, to store and retrieve general-purpose data are well known in the art.

The audio or audio-visual work is typically stored in digital form on digital storage device 6075. Embodiments of digital storage device 6075 are the same as digital storage device 75 described above with respect to FIG. 1. Digital storage device 6075 is accessed by UIP/PC 6200, according to methods well known to those of ordinary skill in the art, to provide a stream of digital samples representing the audio and/or audio-visual work. In an alternative embodiment, the audio or audio-visual work is stored in analog form on an analog storage device. In such an alternative embodiment, a stream of analog signals is input to an apparatus, not shown, that converts the analog signal to digital samples. Many commercially available devices, well known to those of ordinary skill in the art, receive an input analog signal such as a voice signal and sample it at a rate of at least the Nyquist rate to provide a stream of digital samples that can be converted back to an analog signal without loss of fidelity. The digital samples are then transmitted to TSM subsystem 6300.

Concept determiner 6700 accepts different sets of data as input, depending on the option in use. Under option 1, the input data comprise a stream of data representing text or concepts, for example closed-caption data or textual annotations, stored with the current segment of the input audio or audio-visual work being supplied to TSM subsystem 6300. In the case of option 1, concept determiner 6700 passes the incoming stream of data representing text or concepts as output to concept information decoder 6800. Under option 2, the input data comprise: (a) a stream of samples representing portions of the audio or audio-visual work from digital storage device 6075; and (b) current stream location information from digital storage device 6075, used to identify the position in the stream of the samples being sent, for example a sample count or a time value of the beginning of the group of samples transferred from digital storage device 6075. In the case of option 2, concept determiner 6700 provides as output a stream of data representing the concepts contained in the current portion of the audio or audio-visual work being supplied to TSM subsystem 6300. The concepts and/or textual transcript of spoken passages are determined either by extracting closed-caption information from the audio or audio-visual work or by using a speech recognition algorithm to obtain a stream of text from the input audio or audio-visual work. Many methods of extracting closed-caption information and many methods of extracting text using speech recognition algorithms are well known to those of ordinary skill in the art.

Concept information decoder 6800 accepts as input the stream of data representing concept information from concept determiner 6700. In accordance with the present invention, and without limitation, concept information includes: written transcripts, raw text, keywords, phrases, or other representations of concept information known to those of ordinary skill in the art. In response, concept information decoder 6800 generates as output a stream of data representing keywords and concepts for the current portion of the input audio or audio-visual work being sent to TSM subsystem 6300.

Concept information decoder 6800 processes its input to form a concept data representation of the input data stream. For example, concept information decoder 6800 may simply remove adjectives and articles from textual input to provide output consisting only of nouns and noun phrases. Alternatively, concept information decoder 6800 may employ natural language processing to extract conceptual content from a stream of spoken words. Many methods of implementing concept information decoder 6800 are well known to those of ordinary skill in the art.

TSM concept lookup 6500 accepts the following as input: (a) the CSA data structure received from digital storage device 6075; (b) data from concept information decoder 6800 representing the concepts for the current portion of the input audio or audio-visual work being sent to TSM subsystem 6300; and (c) the parameter Theta from UIP/PC 6200. TSM concept lookup 6500 uses a database or scratch-pad memory to maintain a list of records, each of which stores a TSM rate together with the concept information associated with that TSM rate. TSM concept lookup 6500 performs the following steps according to any one of the many methods well known to those of ordinary skill in the art. It searches the database containing the CSA data structure for the most closely matching concept entry. If the difference for the closest-matching entry is within the range specified by the parameter Theta, the TSM rate associated with that entry is provided as output. If no concept entry in the database containing the CSA data structure is within the range specified by the parameter Theta, the previously obtained TSM rate is provided as output. The output is received by TSM rate arbiter 6400.
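The lookup steps above can be sketched in Python as follows. The description does not say how the "difference" between concepts is measured, so this sketch assumes, purely for illustration, that each CSA record is a pair of a keyword set and a TSM rate and that the difference is the Jaccard distance between keyword sets. The class and function names are hypothetical.

```python
def jaccard_distance(a, b):
    """Distance in [0, 1] between two keyword collections (0 = identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


class ConceptLookup:
    """Hypothetical stand-in for TSM concept lookup 6500."""

    def __init__(self, csa_entries, theta, initial_rate=1.0):
        self.entries = list(csa_entries)  # [(keywords, tsm_rate), ...]
        self.theta = theta                # largest difference accepted as a match
        self.prev_rate = initial_rate     # rate reused when nothing matches

    def lookup(self, concepts):
        # Find the closest-matching concept entry.
        best = min(self.entries,
                   key=lambda e: jaccard_distance(e[0], concepts),
                   default=None)
        # Use its rate only if the difference is within Theta;
        # otherwise keep the previously obtained rate.
        if best is not None and jaccard_distance(best[0], concepts) <= self.theta:
            self.prev_rate = best[1]
        return self.prev_rate


if __name__ == "__main__":
    csa = [({"ledger", "invoice"}, 0.5), ({"menu", "mouse"}, 2.0)]
    lut = ConceptLookup(csa, theta=0.6)
    print(lut.lookup({"invoice", "ledger", "tax"}))  # close to the first entry
    print(lut.lookup({"weather"}))                   # no match: previous rate
```

Any other similarity measure could be substituted for the distance function without changing the control flow.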

TSM rate arbiter 6400 receives the following as input: (a) the TSM rate specified by the user, from UIP/PC 6200; (b) the TSM rate from TSM concept lookup 6500; and (c) the parameters Offset, Slew_Limit, and Override from UIP/PC 6200, described in detail below. In response, TSM rate arbiter 6400 produces as output a single TSM rate that is sent to TSM subsystem 6300.

TSM rate arbiter 6400 determines the TSM rate, or playback speed, using one of the following modes of operation.

1. CSA-data-structure-directed playback mode: in this mode, the TSM rate used is the TSM rate provided by TSM concept lookup 6500.

2. CSA data structure offset playback mode: in this mode, the user specifies, through UI 6100, a parameter Offset that is used to adjust the TSM rates specified in the CSA data structure. The output TSM rate is given by the following formula:

TSM_rate = (TSM rate from TSM concept lookup 6500) * (1 + Offset)

For example, if the user specifies an offset of -0.4, TSM rate arbiter 6400 adds the offset to 1 (giving 0.6) and scales each TSM rate specified by TSM concept lookup 6500 by the result, achieving a uniform reduction (slowing) of the TSM rate, or playback speed, of the resulting output signal. Likewise, a positive offset increases (speeds up) the TSM rate, or playback speed, of the resulting output signal. An offset of 0 has no effect on the TSM rate. As can readily be appreciated, other offset methods may be employed to achieve nonlinear as well as linear scaling of the TSM rate.
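As a small worked illustration of the offset formula, the following Python sketch applies an offset of -0.4 to a few lookup rates. The function name `apply_offset` and the sample rates are assumptions chosen for the example.

```python
def apply_offset(lookup_rate, offset):
    """Scale a TSM rate from the concept lookup by (1 + Offset)."""
    return lookup_rate * (1.0 + offset)


if __name__ == "__main__":
    lookup_rates = [2.0, 1.0, 0.5]  # hypothetical rates from the CSA data structure
    for rate in lookup_rates:
        # With Offset = -0.4 every rate is multiplied by 0.6 (uniform slowing).
        print(rate, "->", round(apply_offset(rate, -0.4), 3))
```

Because the offset multiplies every rate by the same factor, the relative speed differences between segments are preserved while the overall pace shifts.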

3. User override of the CSA data structure mode: in this mode, the user can override the TSM rates obtained from TSM concept lookup 6500 and manually control the TSM rate, or playback speed, for portions of the audio or audio-visual work. When the user releases the override, the TSM rate used to determine the TSM rate, or playback speed, of the output signal is again taken from TSM concept lookup 6500, using the CSA data structure entry that corresponds to the concept information in the current segment of the audio or audio-visual work.

To smooth transitions between different TSM rates, TSM rate arbiter 6400 uses a slew-rate parameter specified by the user to limit the rate of change of the TSM rate at its output. TSM rate arbiter 6400 may also scan ahead in the input stream to anticipate the appropriate rate changes for the audio or audio-visual work being played. In this way, the time lag associated with changes in the TSM rate can be reduced, as described below.

As can readily be appreciated, the TSM rate, or playback speed, output by TSM concept lookup 6500 may fluctuate rapidly. The input parameter Slew_Limit is used to control the rate of change of the playback speed. Slew_Limit filters out large jumps in the TSM rate, or playback speed, by requiring the magnitude of any change in the TSM rate to be no greater than the amount specified in Slew_Limit, so that the playback speed changes gradually. It is important to note, however, that when a small Slew_Limit value is chosen, the time required to reach a new TSM rate, or playback speed, becomes long. This can have the undesirable side effect of making the playback speed sluggish in its response. For example, consider the case in which the input is being played at twice normal speed and an item of interest is matched, so that TSM concept lookup 6500 outputs a TSM rate, or playback speed, of half normal speed. In this case, Slew_Limit may impose a transition so long that the words of interest are not played at the speed determined from the CSA data structure entry. One way to avoid this undesirable side effect is to have TSM concept lookup 6500 scan ahead in the audio or audio-visual input stream and obtain future values of the TSM rate, or playback speed, which are used to determine the target TSM rate for upcoming sections of the audio or audio-visual work. When the target TSM rate for an upcoming segment differs from the current TSM rate by so much that Slew_Limit would prevent the TSM rate from adjusting quickly enough, TSM rate arbiter 6400 can begin the transition earlier by adjusting the TSM rate for the current segment in the direction of the specified future TSM rate. Another way to avoid the undesirable side effect of long transition times caused by small Slew_Limit values is to delay the audio or audio-visual input stream by buffering it for a fixed amount equal to the amount that TSM concept lookup 6500 would otherwise have read ahead. This method yields an output stream in which TSM rate transitions begin somewhat earlier in the audio or audio-visual input stream, so that the speed change in the output stream occurs soon enough for the concept to be played at the speed specified by TSM concept lookup 6500 while the speed transitions still adhere to Slew_Limit.
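The effect of scanning ahead can be sketched in Python. The sketch assumes one target rate per segment, a fixed look-ahead measured in segments, and the simplest possible policy: the arbiter slews toward the target of the segment `lookahead` positions ahead instead of the current one. The function names and this fixed-shift policy are illustrative assumptions; the description leaves the exact policy open.

```python
def slew_limited(targets, limit, start):
    """Follow `targets`, changing the rate by at most `limit` per segment."""
    out, prev = [], start
    for target in targets:
        step = max(-limit, min(limit, target - prev))
        prev += step
        out.append(prev)
    return out


def slew_limited_with_lookahead(targets, limit, start, lookahead):
    """Same limiter, but each segment aims at the target `lookahead` segments ahead."""
    last = len(targets) - 1
    shifted = [targets[min(i + lookahead, last)] for i in range(len(targets))]
    return slew_limited(shifted, limit, start)


if __name__ == "__main__":
    # Double speed until segment 4, where an item of interest calls for half speed.
    targets = [2.0, 2.0, 2.0, 2.0, 0.5, 0.5]
    print(slew_limited(targets, 0.5, 2.0))                    # segment 4 is still too fast
    print(slew_limited_with_lookahead(targets, 0.5, 2.0, 3))  # segment 4 plays at 0.5
```

The price of the early transition is that the segments just before the item of interest are also slowed, which matches the observation above that the rate transitions begin somewhat earlier in the stream.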

FIG. 12 shows a flowchart of an algorithm used in one embodiment of TSM rate arbiter 6400 to provide a TSM rate, or playback speed.

As shown in FIG. 12, the following are applied as input to box 7105: (a) the TSM rate specified by the user, received from UIP/PC 6200 (TSM_User); (b) the TSM rate output by TSM concept lookup 6500 (TSM_LUS); (c) the slew limit parameter specified by the user, received from UIP/PC 6200 (Slew_Limit); (d) the override flag specified by the user, received from UIP/PC 6200 (Override); and (e) the offset value specified by the user, received from UIP/PC 6200 (Offset).

At box 7105, a determination is made as to whether Override is true. If so, control is transferred to box 7900; otherwise, control is transferred to box 7200. At box 7200, a determination is made as to whether Offset equals 0.0. If so, control is transferred to box 7300; otherwise, control is transferred to box 7110.

At box 7300, the following variables are computed: Delta = |TSM_Prev - TSM_LUS| and Sign = sign[TSM_Prev - TSM_LUS], where TSM_Prev is the previously determined TSM rate. Control is then transferred to box 7400.

At box 7400, a decision is made by comparing Delta with Slew_Limit. If Delta is greater than Slew_Limit, control is transferred to box 7500; otherwise, control is transferred to box 7600.

At box 7600, Delta is set equal to Sign*Delta, and control is transferred to box 7700. At box 7500, Delta is set equal to Sign*Slew_Limit, and control is transferred to box 7700. At box 7700, TSM_Prev is set equal to TSM_Prev + Delta, and control is transferred to box 7800. At box 7800, TSM_Prev is set equal to TSM, and the value of TSM is provided as output.

At box 7900, TSM is set equal to TSM_User, and control is transferred to box 7800. Finally, at box 7110, TSM is set equal to TSM_LUS * (1 + Offset).
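The flowchart steps can be collected into one short Python sketch. It is an interpretation rather than a transcription, and it makes three assumptions that are marked in the comments: the sign is taken as sign(TSM_LUS - TSM_Prev) so that the rate moves toward the lookup value, box 7700 is read as producing the new TSM value from TSM_Prev + Delta, and box 7110 is assumed to continue to box 7800. The class name `TsmRateArbiter` is hypothetical.

```python
import math


class TsmRateArbiter:
    """Illustrative reading of the FIG. 12 flow for TSM rate arbiter 6400."""

    def __init__(self, initial_rate=1.0):
        self.tsm_prev = initial_rate  # TSM_Prev: previously determined TSM rate

    def step(self, tsm_user, tsm_lus, slew_limit, override, offset):
        if override:                            # box 7105 -> box 7900
            tsm = tsm_user
        elif offset != 0.0:                     # box 7200 -> box 7110
            tsm = tsm_lus * (1.0 + offset)
        else:                                   # box 7200 -> box 7300
            # ASSUMPTION: difference taken toward the lookup rate.
            diff = tsm_lus - self.tsm_prev
            delta = abs(diff)
            sign = math.copysign(1.0, diff)
            if delta > slew_limit:              # box 7400 -> box 7500
                delta = sign * slew_limit
            else:                               # box 7400 -> box 7600
                delta = sign * delta
            # ASSUMPTION: box 7700 yields the new rate TSM.
            tsm = self.tsm_prev + delta
        # box 7800 (ASSUMPTION: also reached from box 7110).
        self.tsm_prev = tsm
        return tsm


if __name__ == "__main__":
    arbiter = TsmRateArbiter(initial_rate=2.0)
    for _ in range(4):
        # The lookup asks for half speed; the rate falls by 0.5 per call.
        print(arbiter.step(1.0, 0.5, 0.5, False, 0.0))
```

In this reading the slew limit applies only in the CSA-directed mode; the override and offset branches set the rate directly.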

Combinations of the above modes of operation are also within the scope of the present invention. For example, a user may combine a user offset with a CSA data structure, applied to closed-caption information embedded in the audio or audio-visual work to be played, to determine the TSM rate required for the output signal.

The output from embodiment 6000 is a stream of digital samples constituting a digitized audio or audio-visual stream that is a time-scale-modified version of the input audio or audio-visual work and that, in accordance with the present invention, reflects the TSM rates, or playback speeds, specified by the CSA data structure and/or user input. This output constitutes the LIF work.

In some embodiments, embodiment 6000 also stores the LIF work for later playback by the same embodiment or by another playback device. In addition, the digital output may be converted to analog form for storage on an analog device. Many devices that receive a digitized input signal, such as 16-bit pulse code modulation, and provide an analog signal output from it are well known to those of ordinary skill in the art. For example, much commercially available equipment receives a stream of digitized samples representing a signal and converts the samples to an analog signal without loss of fidelity. As those of ordinary skill in the art will appreciate, whenever embodiment 6000 provides playback of an audio-visual work, TSM subsystem 6300 speeds up or slows down the visual information to match the audio of the audio-visual work. To do this in a preferred embodiment, the video signal is "frame-subsampled" or "frame-replicated" according to any one of the many methods known to those of ordinary skill in the art, to maintain synchronization between the audio and visual portions of the audio-visual work. Thus, if the audio is sped up and samples are requested at a faster rate, the frame stream is subsampled, i.e., frames are skipped.

Although FIG. 11 shows embodiment 6000 as comprised of separate modules, UI 6100, UIP/PC 6200, TSM subsystem 6300, TSM rate arbiter 6400, TSM concept lookup 6500, concept determiner 6700, and concept information decoder 6800 are implemented as software programs or modules that run on a general-purpose computer such as a personal computer. In addition, digital storage device 6075 is implemented as a disk drive or random access memory, and the digital-to-analog converter is implemented as a typical accessory of a general-purpose computer, such as a sound card in a personal computer. In light of the detailed description, how to implement these programs or modules in software will be understood by those of ordinary skill in the art.

Examples of use of the method and apparatus of the present invention are described below. A first use relates to education using audio-visual works. The apparatus of the present invention allows the TSM rate, or playback rate, of a particular audio-visual work to be controlled on an individual-user basis or on a common basis targeting a particular group of listeners.

For example, assume that an educational audio-visual work explains in detail how to set up and use an order entry accounting system on a particular operating system so that viewers can enter and report particular types of financial transactions. Assume further that the target audience for the work consists of two groups: (a) accountants who are novice computer users, and (b) advanced computer users who are unfamiliar with standard accounting practice. During playback of the audio-visual work, the material is presented as follows. A particular financial transaction is described, along with the appropriate manipulation of the software program's user interface, such as "select the pull-down menu and enter NEW"; a demonstration of the actual process then follows. When the work is played at normal speed, the accounting professionals who are novice computer users become impatient during the description of the financial transactions, with which they are already familiar; because they are unfamiliar with this kind of interface, however, they may find the demonstration of how entries are made in the software too fast. Similarly, advanced computer users who are novice accountants find the pace of instruction (the speaking rate) during the description of a particular financial transaction overwhelming, yet become impatient during the slow, methodical demonstration of the entry process that has just been described verbally.

Embodiments of the present invention solve this problem as follows. Two speed contours are supplied with the audio-visual work. One speed contour (FastCompSlowAcc.spdcon) is for advanced computer users who are novice accountants, and the other (FastAccSlowComp.spdcon) is for expert accountants who are novice computer users. Speed contour FastCompSlowAcc.spdcon specifies TSM rates that speed up the audio-visual segments containing the demonstration and slow down playback during the description of the accounting transaction. Speed contour FastAccSlowComp.spdcon specifies TSM rates that slow down playback during the demonstration and speed up the audio-visual segments that describe the accounting transaction. By loading the appropriate speed contour, each target audience receives the information at a rate matched to its own level of understanding of each segment of the audio-visual work. As a result, embodiments of the present invention eliminate the need to create multiple versions of the same audio-visual work for different target audiences.
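
The two-contour arrangement described above can be illustrated with a minimal Python sketch. The description does not specify the contents of a .spdcon file, so the sketch simply assumes that a speed contour maps segment identifiers to TSM rates (1.0 being normal speed); the segment labels, durations, and rate values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    segment_id: str    # identifier of a portion of the work
    kind: str          # hypothetical label: "explanation" or "demonstration"
    duration_s: float  # duration at normal speed, in seconds


def build_contour(segments, rate_by_kind):
    """Build a speed contour: segment identifier -> TSM rate (1.0 = normal)."""
    return {seg.segment_id: rate_by_kind[seg.kind] for seg in segments}


def playback_time(segments, contour):
    """Total playback time when each segment is played at its contour rate."""
    return sum(seg.duration_s / contour[seg.segment_id] for seg in segments)


# Hypothetical work: explanation, demonstration, explanation.
work = [
    Segment("s1", "explanation", 60.0),
    Segment("s2", "demonstration", 90.0),
    Segment("s3", "explanation", 30.0),
]

# Advanced computer users who are novice accountants:
# slow explanations, fast demonstrations (illustrative rates).
fast_comp_slow_acc = build_contour(
    work, {"explanation": 0.75, "demonstration": 1.5})

# Expert accountants who are novice computer users:
# fast explanations, slow demonstrations (illustrative rates).
fast_acc_slow_comp = build_contour(
    work, {"explanation": 1.5, "demonstration": 0.75})

if __name__ == "__main__":
    print(playback_time(work, fast_comp_slow_acc))
    print(playback_time(work, fast_acc_slow_comp))
```

The same stored work serves both audiences; only the small contour changes, which is the point the paragraph above makes about avoiding multiple versions.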

In the example above, the users of the audio-visual work fall into two specific groups. In many cases, however, the creator of an audio-visual work does not know the comprehension rates of the audience that will view the material presented in the work. In such cases, each user may load a Conceptual Speed Association (CSA) data structure that contains keywords and phrases along with information about the ideal presentation rate for particular concepts. The CSA data structure lets users view information at a presentation rate matched to their own comprehension rates for different material.

A second use of the method and apparatus of the present invention relates to entertainment using audio-visual works. It will be apparent to those skilled in the art that embodiments of the present invention are not limited to matching presentation rate to comprehension rate in educational audio-visual works. Indeed, embodiments of the present invention also address the problem of matching presentation rate to the entertainment level or interest level of a particular audio-visual work, thereby giving the listener/viewer greater enjoyment. For example, listeners and moviegoers may employ a CSA data structure in accordance with the present invention to control the playback rate of an audio or audio-visual work so that scenes or passages of violence and suspense are played at a fast rate, thereby avoiding excessive anxiety. Similarly, listeners and moviegoers with a strong interest in romantic dialogue may use a CSA data structure or speed contour in accordance with the present invention to slow the playback rate for those portions. As can readily be appreciated, each user or family may use a CSA data structure reflecting their interests as a "filter", and use embodiments of the present invention to generate LIF works from ordinary movies, television shows, and other entertainment audio or audio-visual works. As can also readily be appreciated, a valuable service in accordance with the present invention would provide CSA data structures or speed contours for particular audio or audio-visual works, which could be used to alter the content of those works. For example, a movie's rating could be changed from "R" to "PG-13" using a speed contour that removes particular passages containing adult language or concepts.

Once created, a CSA data structure can thereafter be used to guide the playback rate of audio or audio-visual works that the listener has not previously heard. In this way, the pairings of concepts and TSM rates (reflecting interest, comprehension rate, and so forth) obtained by listening to various audio or audio-visual works are captured, stored, and later used to guide the TSM rate or playback rate of a work that the listener is playing for the first time. Thus, a CSA data structure can be used to automatically generate a speed contour tailored to the user's interests, or to control the playback rate, for works the user has not heard. This ability to control the playback rate of unheard works, or to generate speed contours for them, allows embodiments of the present invention to function as an information filter that matches the delivery rate of every audio and audio-visual work presented to the user to the user's level of interest in the concepts contained in the CSA data structure.
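
As a minimal sketch of how a stored CSA data structure might produce a speed contour for an unheard work, the Python fragment below assumes the CSA data structure is a mapping from concept to TSM rate and that some concept determiner has already labeled each segment with concepts. The function name, the default rate, and the rule of taking the slowest matching rate when several concepts apply are assumptions made for illustration, not details taken from the description.

```python
DEFAULT_RATE = 1.0  # assumed rate for segments with no known concept


def contour_from_csa(csa, segment_concepts, default_rate=DEFAULT_RATE):
    """Generate a speed contour for a work from a CSA data structure.

    csa: dict mapping concept -> TSM rate.
    segment_concepts: list of (segment_id, [concepts found in the segment]).
    Returns a list of (segment_id, tsm_rate) pairs.
    """
    contour = []
    for segment_id, concepts in segment_concepts:
        rates = [csa[c] for c in concepts if c in csa]
        # Assumed policy: if several concepts match, use the slowest rate.
        rate = min(rates) if rates else default_rate
        contour.append((segment_id, rate))
    return contour


# Hypothetical CSA data structure captured from earlier listening sessions.
csa = {"violence": 2.0, "romance": 0.8, "weather": 1.5}

# Hypothetical concept labels for a work the listener has never heard.
new_work = [
    ("seg-1", ["weather"]),
    ("seg-2", ["romance", "weather"]),
    ("seg-3", ["cooking"]),
]

if __name__ == "__main__":
    print(contour_from_csa(csa, new_work))
```

Other combination policies (for example, averaging the matching rates) would fit the description equally well; the choice here only keeps the sketch short.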

A third use of the method and apparatus of the present invention relates to content production and advertising. In this use, a speed contour that captures interest and holds the listener/viewer's attention can be determined by sampling a particular target audience or market segment. For example, if an advertisement targets people who own a particular brand or model of computer, the advertisement's producer can shoot the advertisement and then adjust the speed contour so that information is delivered to the target audience at an appropriate presentation rate, thereby gaining their interest. In addition, different speed contours can be developed according to the broadcast audience's familiarity with the subject matter presented in the advertisement, and sent to different radio or television stations and/or time slots. Thus, in accordance with the present invention, a particular commercial may be shortened to 20 seconds using a first speed contour when it airs during a talk show about home-computer maintenance, and extended to 30 seconds using a second speed contour when the same commercial airs during the evening news, to accommodate the slower comprehension rate of listeners unfamiliar with computer terminology.
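
The arithmetic behind the 20-second and 30-second versions can be sketched as follows. The description does not state the commercial's length at normal speed, so the 25-second figure below is purely hypothetical, and the sketch assumes a single constant TSM rate over the whole commercial, whereas an actual speed contour may vary the rate from segment to segment.

```python
def uniform_tsm_rate(nominal_s: float, target_s: float) -> float:
    """Constant TSM rate that plays nominal_s seconds of material in target_s seconds."""
    if nominal_s <= 0 or target_s <= 0:
        raise ValueError("durations must be positive")
    return nominal_s / target_s


NOMINAL_S = 25.0  # hypothetical length of the commercial at normal speed

# First speed contour: shorten to 20 seconds (rate above 1.0 speeds playback up).
talk_show_rate = uniform_tsm_rate(NOMINAL_S, 20.0)

# Second speed contour: extend to 30 seconds (rate below 1.0 slows playback down).
evening_news_rate = uniform_tsm_rate(NOMINAL_S, 30.0)

if __name__ == "__main__":
    print(talk_show_rate, evening_news_rate)
```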

A fourth use of the method and apparatus of the present invention is an application of a CSA data structure containing a concept entry for numeric digits paired with a TSM rate that specifies a slow playback rate. In this case, the method of the present invention can be applied to a voice mail system when a listener retrieves voice mail messages. The concept determiner performs simple speech recognition to detect the presence of digits in a message. In this way, all telephone numbers and numeric quantities are automatically slowed down, which makes it easier for the user to transcribe them. Embodiments of the present invention can also be used, without limitation, to specify playback rates for concepts such as dates and addresses.
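
A minimal sketch of the voice mail case follows. It assumes a speech recognizer has already produced a list of words with start and end times (the recognizer itself is outside the sketch), treats a word as numeric if it is written in digits or is a spoken digit word, and builds a speed contour of (start, end, rate) spans. The function names, the "digit" concept key, and the 0.6 rate are hypothetical.

```python
DIGIT_WORDS = {"zero", "oh", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"}


def is_numeric(word: str) -> bool:
    """True if the recognized word is a digit string or a spoken digit."""
    w = word.lower().strip(".,").replace("-", "")
    return w.isdigit() or w in DIGIT_WORDS


def voicemail_contour(words, csa, default_rate=1.0):
    """Build a speed contour from timed words.

    words: list of (word, start_s, end_s) in playback order.
    csa: dict with a "digit" entry giving the slow TSM rate for numbers.
    Returns a list of (start_s, end_s, tsm_rate) spans, merging adjacent
    words that share a rate.
    """
    contour = []
    for word, start, end in words:
        rate = csa["digit"] if is_numeric(word) else default_rate
        if contour and contour[-1][2] == rate and contour[-1][1] == start:
            # Extend the previous span instead of starting a new one.
            contour[-1] = (contour[-1][0], end, rate)
        else:
            contour.append((start, end, rate))
    return contour


# Hypothetical recognizer output for "call me at five five five today".
message = [
    ("call", 0.0, 0.4), ("me", 0.4, 0.6), ("at", 0.6, 0.8),
    ("five", 0.8, 1.1), ("five", 1.1, 1.4), ("five", 1.4, 1.7),
    ("today", 1.7, 2.2),
]

if __name__ == "__main__":
    print(voicemail_contour(message, {"digit": 0.6}))
```

Dates and addresses could be handled the same way by adding further concept entries and matching rules.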

A fifth use of the method and apparatus of the present invention relates to foreign language teaching and learning. A student listening to an audio or audio-visual work containing foreign language instruction would use an embodiment of the present invention to generate a speed contour while listening to the various passages. By showing which passages were requested at a slower rate than others, or were repeated, the speed contour reflects the student's comprehension rate for the material. The speed contour can then be used to grade the student and to direct future study. For example, the student could listen to audio or audio-visual works using a customized speed contour that provides additional practice to help develop word-parsing ability when listening to rapidly spoken passages. In addition, the CSA data structure generated while listening to material presented in a foreign language using an embodiment of the present invention can be analyzed to determine which concepts are troublesome for a particular student. In this way, the same audio or audio-visual work can be presented to a class, with each student using the present invention to obtain a CSA data structure that contains information about his or her comprehension rate for the concepts in the material. The CSA data structures can then be presented graphically, or ordered by concept, so that the instructor can grade individuals and gauge the comprehension rate of each student or of the group. For example, metrics could be developed that relate the user-requested playback rate to familiarity with, or comprehension of, the subject matter embodied in the concepts.
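
One way an instructor might order class results by concept is sketched below. It assumes each student's CSA data structure is a mapping from concept to the TSM rate that student requested, and uses the class-average rate per concept as a rough comprehension measure, with the lowest average listed first. The metric, the student names, and the concept labels are illustrative assumptions; the description leaves the actual metric open.

```python
from statistics import mean


def rank_concepts(student_csas):
    """Order concepts from slowest to fastest class-average requested rate.

    student_csas: dict mapping student name -> {concept: requested TSM rate}.
    Returns a list of (concept, average_rate) pairs, slowest first.
    """
    by_concept = {}
    for csa in student_csas.values():
        for concept, rate in csa.items():
            by_concept.setdefault(concept, []).append(rate)
    averages = {concept: mean(rates) for concept, rates in by_concept.items()}
    return sorted(averages.items(), key=lambda item: item[1])


# Hypothetical CSA data structures gathered from one class session.
class_csas = {
    "ana": {"greetings": 1.0, "subjunctive": 0.5},
    "ben": {"greetings": 1.5, "subjunctive": 0.75, "numbers": 1.0},
}

if __name__ == "__main__":
    for concept, avg in rank_concepts(class_csas):
        print(f"{concept}: {avg}")
```

Concepts at the top of the list are those the class slowed down most, which suggests where additional instruction may be needed.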

Those skilled in the art will recognize that the foregoing description has been presented for purposes of illustration and description only. As such, it is not intended to be exhaustive or to limit the invention to the precise form disclosed.

For example, it will be apparent to those skilled in the art that the audio or audio-visual works described above can be input to embodiments of the present invention from the Internet. It will likewise be apparent to those skilled in the art that speed contours or CSA data structures can be used as filter information accessed over the Internet, and that embodiments of the present invention can be included as part of a search engine used to access audio or audio-visual works on the Internet.

As another example, in an embodiment of the present invention, a speed contour can include a TSM rate entry such as, for example, "affinity" for a particular portion of an audio or audio-visual work. In such an embodiment, a TSM rate of "affinity" (or some other indicium that is interpreted in the same way) directs the playback system to skip the sections of the audio or audio-visual work associated with that TSM rate. Thus, in accordance with such an embodiment, a user can designate particular portions as "not interested" when searching or listening to an audio or audio-visual work.
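
The skip behavior can be sketched in Python as follows. The sketch assumes a speed contour is a list of (segment identifier, normal-speed duration, TSM rate) entries and uses a string sentinel to stand for the special TSM rate entry described above; the sentinel value, function name, and segment data are hypothetical.

```python
SKIP = "affinity"  # sentinel standing in for the special TSM rate entry


def playback_plan(contour):
    """Return (segment_id, playback_seconds) for every segment that is played.

    contour: list of (segment_id, duration_s, tsm_rate), where tsm_rate is
    either a positive number or the SKIP sentinel.
    """
    plan = []
    for segment_id, duration_s, rate in contour:
        if rate == SKIP:
            continue  # the playback system skips this section entirely
        plan.append((segment_id, duration_s / rate))
    return plan


# Hypothetical contour in which the user marked one section "not interested".
contour = [
    ("intro", 10.0, 1.0),
    ("sponsor", 30.0, SKIP),
    ("main", 60.0, 2.0),
]

if __name__ == "__main__":
    print(playback_plan(contour))
```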

As yet another example, it will be apparent to those skilled in the art that embodiments of the present invention include: (a) a computer-readable medium encoded with a CSA data structure; (b) a computer-readable medium encoded with a speed contour; (c) a computer-readable medium encoded with an audio or audio-visual work together with a CSA data structure; and (d) a computer-readable medium encoded with an audio or audio-visual work together with a speed contour. For a computer-readable medium encoded with an audio or audio-visual work together with a CSA data structure or speed contour, many methods well known to those skilled in the art exist for storing the work together with the CSA data structure or speed contour.

Claims

An apparatus for generating a speed contour comprising affinity information used to obtain a time-scale modification (TSM) rate and identifier information used to obtain an identifier of a portion of an audio or audio-visual work associated with the TSM rate, the apparatus comprising:

a user input device that receives user information and directs input of a portion of the audio or audio-visual work;

a TSM system that generates a TSM portion in response to the identifier of the portion, the portion, and the TSM rate;

a TSM monitor that, in response to the user information, the identifier of the portion, and the portion, generates the TSM rate and the identifier of the portion associated with the TSM rate; and

a speed contour generator that generates the speed contour in response to the identifier of the associated portion and the TSM rate.

The apparatus of claim 1, further comprising a storage device that stores the speed contour in response to the speed contour.

The apparatus of claim 2, wherein the speed contour is stored together with user authentication information included in the user information.

The apparatus of claim 3, wherein the speed contour generator generates a single speed contour that is a function of a plurality of speed contours generated for a particular user.

The apparatus of claim 3, wherein the speed contour generator generates a single speed contour that is a function of a plurality of speed contours generated for other users.

The apparatus of claim 1, wherein the TSM monitor generates one or more specific TSM rates in response to predetermined user information, and wherein the TSM system does not generate the TSM portion in response to the one or more specific TSM rates.

The apparatus of claim 1, further comprising a display device that displays the speed contour in response to the speed contour.

The apparatus of claim 1, further comprising a playback device that plays back the TSM portion in response to the TSM portion.

The apparatus of claim 8, further comprising a storage device that stores the TSM portion in response to the TSM portion.

The apparatus of claim 1, wherein the affinity information comprises a measure of user interest in the portion.

The apparatus of claim 1, wherein the affinity information comprises a derivative of the TSM rate.

A method of generating a speed contour comprising affinity information used to obtain a time-scale modification (TSM) rate and identifier information used to obtain an identifier of a portion of an audio or audio-visual work associated with the TSM rate, the method comprising:

obtaining user information and directing input of a portion of the audio or audio-visual work;

generating a TSM portion in response to an identifier of the portion, the portion, and a TSM rate;

generating, in response to the user information, the identifier of the portion, and the portion, the TSM rate and the identifier of the portion associated with the TSM rate; and

generating the speed contour in response to the identifier of the associated portion and the TSM rate.

An apparatus for generating a conceptual speed association (CSA) data structure comprising affinity information used to obtain a time-scale modification (TSM) rate and concept information used to obtain a concept identifier of a portion of an audio or audio-visual work associated with the TSM rate, the apparatus comprising:

a user input device that receives user information and directs input of a portion of the audio or audio-visual work;

a TSM system that generates a TSM portion in response to the identifier of the portion, the portion, and the TSM rate;

a concept decoder that generates a concept of the portion in response to the identifier of the portion and the portion;

a TSM concept monitor that, in response to the user information and the concept, generates the TSM rate and the concept identifier associated with the TSM rate; and

a CSA data structure generator that generates the CSA data structure in response to the TSM rate and the associated concept identifier.

The apparatus of claim 13, further comprising a storage device that stores the CSA data structure in response to the CSA data structure.

The apparatus of claim 14, wherein the CSA data structure is stored together with user authentication information included in the user information.

The apparatus of claim 15, wherein the CSA data structure generator generates a single CSA data structure that is a function of a plurality of CSA data structures generated for a particular user.

The apparatus of claim 15, wherein the CSA data structure generator generates a single CSA data structure that is a function of a plurality of CSA data structures generated for a plurality of different users.

The apparatus of claim 13, wherein the TSM concept monitor generates one or more specific TSM rates in response to predetermined user information, and wherein the TSM system does not generate a TSM portion in response to the one or more specific TSM rates.

The apparatus of claim 13, further comprising a display device that displays the CSA data structure in response to the CSA data structure.

The apparatus of claim 13,

The apparatus of claim 20, further comprising a storage device that stores the TSM portion in response to the TSM portion.

A method of generating a conceptual speed association (CSA) data structure comprising affinity information used to obtain a time-scale modification (TSM) rate and concept information used to obtain a concept identifier of a portion of an audio or audio-visual work associated with the TSM rate, the method comprising:

generating a TSM portion in response to an identifier of the portion, the portion, and a TSM rate;

generating a concept of the portion in response to the identifier of the portion and the portion;

generating, in response to user information and the concept, the TSM rate and a concept identifier associated with the TSM rate; and

generating the CSA data structure in response to the TSM rate and the associated concept identifier.

An apparatus for playing back an audio or audio-visual work in conjunction with a speed contour comprising affinity information used to obtain a time-scale modification (TSM) rate and identifier information used to obtain an identifier of a portion of the audio or audio-visual work associated with the TSM rate, the apparatus comprising:

an input device that directs input of a portion of the audio or audio-visual work;

a playback device that plays back a TSM portion in response to the TSM portion; and

a TSM rate determiner that generates the TSM rate in response to the speed contour and the identifier of the portion.

The apparatus of claim 23, wherein the input device further comprises a user input device that receives user information, and wherein the TSM rate determiner also generates the TSM rate in response to the user information.

The apparatus of claim 24, wherein the user information includes an offset, and wherein the TSM rate determiner uses the offset to generate the TSM rate.

The apparatus of claim 24, wherein the user information includes a user-specified quantization step for the TSM rate, and wherein the TSM rate determiner generates the TSM rate using the user-specified quantization step.

A method of playing back an audio or audio-visual work in conjunction with a speed contour comprising affinity information used to obtain a time-scale modification (TSM) rate and identifier information used to obtain an identifier of a portion of the audio or audio-visual work associated with the TSM rate, the method comprising:

directing input of a portion of the audio or audio-visual work;

generating a TSM portion in response to an identifier of the portion, the portion, and a TSM rate;

playing back the TSM portion in response to the TSM portion; and

generating the TSM rate in response to the identifier of the portion and the speed contour.

An apparatus for playing back an audio or audio-visual work in conjunction with a conceptual speed association (CSA) data structure comprising affinity information used to obtain a time-scale modification (TSM) rate and concept information used to obtain a concept identifier of a portion of the audio or audio-visual work associated with the TSM rate, the apparatus comprising:

an input device that directs input of a portion of the audio or audio-visual work;

a playback device that plays back a TSM portion in response to the TSM portion;

a concept decoder that generates a concept for the portion in response to the identifier of the portion and the portion; and

a TSM concept lookup that generates the TSM rate in response to the concept and the CSA data structure.

The apparatus of claim 28, wherein the TSM concept lookup also generates the TSM rate in response to user information.

The apparatus of claim 29, wherein the user information includes an offset, and wherein the TSM concept lookup uses the offset to generate the TSM rate.

The apparatus of claim 29, wherein the user information includes a user-specified quantization step for the TSM rate, and wherein the TSM concept lookup generates the TSM rate using the user-specified quantization step.

The apparatus of claim 29, wherein the user information includes a concept and a TSM rate for the concept, and wherein the TSM concept lookup generates the TSM rate using the user-specified concept and TSM rate.

A method of playing back an audio or audio-visual work in conjunction with a conceptual speed association (CSA) data structure comprising affinity information used to obtain a time-scale modification (TSM) rate and concept information used to obtain a concept identifier of a portion of the audio or audio-visual work associated with the TSM rate, the method comprising:

directing input of a portion of the audio or audio-visual work;

generating a TSM portion in response to an identifier of the portion, the portion, and a TSM rate;

playing back the TSM portion in response to the TSM portion;

generating a concept of the portion in response to the identifier of the portion and the portion; and

generating the TSM rate in response to the concept and the CSA data structure.

An apparatus for converting a speed contour, which comprises affinity information used to obtain a time-scale modification (TSM) rate and identifier information used to obtain an identifier of a portion of an audio or audio-visual work associated with the TSM rate, into a conceptual speed association (CSA) data structure, which comprises affinity information used to obtain a TSM rate and concept information used to obtain a concept identifier for a portion of the audio or audio-visual work associated with the TSM rate, the apparatus comprising:

an input device that, in response to the speed contour, determines the TSM rate and the identifier of the portion, and directs input of the portion of the audio or audio-visual work;

a concept decoder that generates a concept for the portion in response to the identifier of the portion and the portion;

a TSM concept monitor that associates a concept identifier with the TSM rate in response to the concept and the TSM rate; and

a CSA data structure generator that generates the CSA data structure in response to the TSM rate and the associated concept identifier.

An apparatus for converting a conceptual speed association (CSA) data structure, which comprises affinity information used to obtain a time-scale modification (TSM) rate and concept information used to obtain a concept identifier for a portion of an audio or audio-visual work associated with the TSM rate, into a speed contour, which comprises affinity information used to obtain a TSM rate and identifier information used to obtain an identifier of a portion of the audio or audio-visual work associated with the TSM rate, the apparatus comprising:

an input device that directs input of a portion of the audio or audio-visual work and determines an identifier of the portion;

a TSM concept lookup that generates the TSM rate in response to the concept and the CSA data structure; and

a speed contour generator that generates the speed contour in response to the TSM rate and the identifier of the associated portion.

A method of generating a listener-interest-filtered work for an audio or audio-visual work, the method comprising:

generating, for one or more user categories, one or more average speed contours for one or more audio or audio-visual works;

converting the one or more average speed contours into one or more conceptual speed association (CSA) data structures; and

forming a single listener-interest-filtered CSA data structure from the one or more CSA data structures.

The method of claim 36, further comprising converting the listener-interest-filtered CSA data structure into a listener-interest-filtered speed contour.

The method of claim 36, further comprising generating a listener-interest-filtered audio or audio-visual work using the listener-interest-filtered CSA data structure.

The method of claim 37, further comprising generating a listener-interest-filtered audio or audio-visual work using the listener-interest-filtered speed contour.

The method of claim 36, further comprising modifying the listener-interest-filtered CSA data structure.

The apparatus of claim 13, wherein the concept decoder comprises an algorithm for recognizing categories of concepts.

42. The apparatus of claim 41, wherein one of the categories comprises numbers.

A user input device that receives user information and directs input of a portion of an audio or audio-visual work;

a display device that, in response to an identifier of the portion, displays a representation of the input portion and a TSM rate associated with the portion;

an editor that, in response to the user information, provides an identifier of a user-selected portion of the display and a user-specified TSM rate for the user-selected portion; and

a speed contour generator that generates a speed contour in response to the user-specified TSM rate, the associated TSM rate, and the identifier of the user-selected portion.

The apparatus of claim 43, wherein the display on the display device includes text from closed-caption material associated with the input portion.

The apparatus of claim 43, wherein the display on the display device includes text determined by speech recognition of the input portion.

A computer-readable medium encoded with a CSA data structure comprising affinity information used to obtain a time-scale modification (TSM) rate and concept information used to obtain a concept identifier of a portion of an audio or audio-visual work associated with the TSM rate.

The computer-readable medium of claim 46, wherein the computer-readable medium is further encoded with an audio or audio-visual work.

A computer-readable medium encoded with a speed contour comprising affinity information used to obtain a time-scale modification (TSM) rate and identifier information used to obtain an identifier of a portion of an audio or audio-visual work associated with the TSM rate.

49. The method of claim 48 wherein