EP1002266A1 - Multi-media display system - Google Patents

Multi-media display system

Info

Publication number
EP1002266A1
EP1002266A1 EP98939330A EP98939330A EP1002266A1 EP 1002266 A1 EP1002266 A1 EP 1002266A1 EP 98939330 A EP98939330 A EP 98939330A EP 98939330 A EP98939330 A EP 98939330A EP 1002266 A1 EP1002266 A1 EP 1002266A1
Authority
EP
European Patent Office
Prior art keywords
image
data
display
tracks
sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP98939330A
Other languages
English (en)
French (fr)
Other versions
EP1002266B1 (de)
Inventor
Kagenori Nagao
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HP Inc
Original Assignee
Hewlett Packard Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hewlett Packard Co
Publication of EP1002266A1
Application granted
Publication of EP1002266B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0091 Means for obtaining special acoustic effects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0008 Associated control or indicating means
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/155 Musical effects
    • G10H2210/265 Acoustic effect simulation, i.e. volume, spatial, resonance or reverberation effects added to a musical sound, usually by appropriate filtering or delays
    • G10H2210/295 Spatial effects, musical uses of multiple audio channels, e.g. stereo
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2220/00 Input/output interfacing specifically adapted for electrophonic musical tools or instruments
    • G10H2220/091 Graphical user interface [GUI] specifically adapted for electrophonic musical instruments, e.g. interactive musical displays, musical instrument icons or menus; Details of user interactions therewith
    • G10H2220/101 Graphical user interface [GUI] specifically adapted for electrophonic musical instruments, e.g. interactive musical displays, musical instrument icons or menus; Details of user interactions therewith for graphical creation, edition or control of musical data or parameters
    • G10H2220/106 Graphical user interface [GUI] specifically adapted for electrophonic musical instruments, e.g. interactive musical displays, musical instrument icons or menus; Details of user interactions therewith for graphical creation, edition or control of musical data or parameters using icons, e.g. selecting, moving or linking icons, on-screen symbols, screen regions or segments representing musical elements or parameters

Definitions

  • The present invention relates to display systems for playing multi-media works, and more particularly, to a sound processing system that alters a sound track in response to the cropping of an image associated with the sound track.
  • Multi-media works consisting of still or moving images with narration, background sounds, and background music are becoming common. Such works may be found on the Internet or on CD-ROM. Systems for displaying motion pictures with sound on computers and other data processing systems are also common, using programs such as VIDEO FOR WINDOWS to reproduce the work on computers. Furthermore, 3-dimensional modeling of sound can be specified in VRML 2.0. In a VRML 2.0 compliant browser, the sound generated by the components of a scene is specified by providing separate sound tracks for each sound source together with the location of that sound source in the scene. The sound observed by a listener facing any direction at any position relative to the sound source can then be reproduced by combining the individual sound sources.
  • VIDEO FOR WINDOWS does not provide the ability to crop the motion picture image and to display the cropped image on the screen. For that reason, a conventional AVI file, which is the motion picture file format used by VIDEO FOR WINDOWS, normally does not include data for controlling multiple audio streams in response to the position of a cropping frame in the motion picture image.
  • A conventional program such as VIDEO FOR WINDOWS also lacks the ability to control the audio signals decoded from the multiple audio streams in response to the user defining the position of a cropping frame in the motion picture image.
  • VRML 2.0 provides the data needed to generate a sound track corresponding to the point of view of the user, thereby creating a 3-dimensional sound image that can be changed in response to cropping, etc.
  • However, systems implementing VRML 2.0 do not alter the "sound image" in response to changes in the visual image.
  • The sound model implemented by VRML 2.0 is customized to implement 3-dimensional sound effects, and is poorly suited for applications that process audio data linked to 2-dimensional images. Therefore, none of the existing programs can automatically control the audio to match the user's definition of a cropping frame in the motion picture image.
  • The present invention is a display system for performing a multi-media work that includes image data representing a still or moving image, and sound data associated with the image data.
  • The system includes a display for displaying an image derived from the image data, an audio playback system for combining and playing first and second audio tracks linked to the image, and a pointing system for selecting a region of the image on the display in response to commands from a user of the display system.
  • The system also includes a playback processor for altering the combination of the first and second audio tracks played by the audio playback system in response to the pointing system selecting a new region of the image.
  • The playback processor also alters the display such that the portion of the image selected by the pointing system is centered in the display.
  • The first and second audio tracks include sound tracks to be mixed prior to playback.
  • The image includes data specifying gains to be used in the mixing for the sound tracks when the selected region of the display is centered at predetermined locations in the image. If the predetermined locations do not include the center of the selected region, the playback system interpolates the data for the predetermined locations to provide the gains to be used in mixing the sound tracks.
  • The multi-media work includes data for specifying images at multiple resolutions. In this embodiment, the pointing system further selects one of the resolutions in response to input from the user. The playback processor then alters the combination of the first and second audio tracks played by the audio playback system in response to both the selected region and the selected resolution.
  • Figure 1 illustrates a simple multi-media display.
  • Figure 2 is a schematic drawing of an image display system according to one embodiment of the present invention.
  • Figure 3 is a block diagram of a sound and image processing system according to another embodiment of the present invention.
  • Figure 4 illustrates the interpolation of the sound processing parameters stored for selected pixels in an image to obtain new sound processing parameters.
  • FIG. 1 illustrates a simple multi-media display.
  • The display consists of an image 11 of a piano 15 and a bass 16 and a sound track of a musical work generated by the two instruments.
  • The sound track is played through a stereo sound system consisting of speakers 17 and 18.
  • The stereo sound track is constructed from two audio tracks, one for the piano and one for the bass.
  • Each audio track has right and left hand components which are mixed to generate the signals sent to speakers 17 and 18.
  • The mixing of the signals consistent with image 11 generates an "acoustical image" in which the piano appears to be located closer to speaker 17, and the bass appears to be located closer to speaker 18.
  • Many playback systems allow the user to zoom into various portions of the display by defining a cropping frame around the desired portion.
  • The cropped image is then re-displayed in its own frame.
  • In some systems, the cropped image is enlarged to fill the original frame.
  • Prior art systems do not alter the acoustical image to take into account the new visual image.
  • The cropped image shown in cropping frame 12 would have an acoustical image in which piano 15 still appears to be located at the same position in the cropped frame as it occupied in the original frame. That is, piano 15 still appears to be closer to speaker 17 even though it is now in the middle of the new frame. This inconsistency between the acoustical and visual images is distracting to human observers.
  • The present invention overcomes this problem with prior art displays by altering the acoustical image in response to the cropping of the original image.
  • When the user selects a cropping frame such as frame 14, the sound tracks are remixed such that the apparent sound sources are likewise shifted in position in the acoustical image.
  • For example, the sound of the bass would be moved such that it is equidistant between speakers 17 and 18 when the viewing frame is switched from the original frame 11 to that shown in cropped frame 14.
  • FIG. 2 is a schematic drawing of an image display system 50 according to one embodiment of the present invention.
  • The user specifies a cropping frame using, for example, a pointer 65 applied to image data 57 that is displayed on display 70.
  • The cropped image boundary is input via a cropping controller 51, which sends the limits of the new frame to the appropriate cropping routine 52 in the display system.
  • The new image boundaries are also sent to a gain controller 53, which controls the mixing of the right and left speaker signal components generated for each audio track.
  • The audio tracks are separately processed and then mixed in the playback system 66 via summing amplifiers 58 and 59 to provide the final signals that are sent to the right and left audio channels, 61 and 62, of the stereo system.
  • Exemplary audio tracks are shown at 54-56.
  • Each audio track includes left and right components whose relative gain is determined by the gain settings applied to a corresponding pair of amplifiers.
  • The amplifiers corresponding to audio track 54 are shown at 63 and 64.
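The per-track gain stage and the summing stage described for FIG. 2 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name `mix_tracks`, the use of one mono sample list per audio track, and the `(left_gain, right_gain)` tuple layout are all hypothetical. Each track is scaled by its own pair of gains (the role played by amplifiers such as 63 and 64), and the scaled signals are summed per channel (the role played by summing amplifiers 58 and 59).

```python
def mix_tracks(tracks, gains):
    """Mix several mono tracks into one stereo pair.

    tracks: list of equal-length sample lists, one per audio track.
    gains:  list of (left_gain, right_gain) tuples, one per track
            (hypothetical layout chosen for this sketch).
    Returns (left, right) lists of mixed samples.
    """
    if len(tracks) != len(gains):
        raise ValueError("need exactly one gain pair per track")
    n = len(tracks[0]) if tracks else 0
    left = [0.0] * n
    right = [0.0] * n
    for samples, (left_gain, right_gain) in zip(tracks, gains):
        if len(samples) != n:
            raise ValueError("all tracks must have the same length")
        for i, s in enumerate(samples):
            # Per-track amplifier pair followed by the per-channel sum.
            left[i] += left_gain * s
            right[i] += right_gain * s
    return left, right


# Two toy tracks: the first panned fully left, the second centered.
piano = [1.0, 0.5]
bass = [0.0, 1.0]
left, right = mix_tracks([piano, bass], [(1.0, 0.0), (0.5, 0.5)])
print(left, right)
```

Changing only the gain pairs, while leaving the track data untouched, is what lets a gain controller move an apparent sound source after the image is cropped.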
  • The sound track attributes of each source are specified for each pixel at position (x, y) in the image.
  • The information stored for each pixel, P(x, y), could include the left and right channel gains for each audio track in addition to the image pixel value v, i.e.,
    $P(x, y) = (v, R_1, L_1, R_2, L_2, \ldots, R_n, L_n)$    (1)
  • The data from Eq. (1) for the pixel that is now at the center of the display can then be used to recompute the audio attributes by altering the relative mixing of each sound track in accordance with the location of the sound source for that sound track within the new frame created by cropping the old frame.
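A lookup of the gains stored for the pixel at the center of a new cropping frame might look like the sketch below. The names `PixelRecord`, `crop_center`, and `gains_for_crop`, the `(x0, y0, x1, y1)` crop convention, and the `(left_gain, right_gain)` ordering are assumptions made for illustration; the text only states that each pixel carries its value v and a gain pair per audio track.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PixelRecord:
    """Hypothetical per-pixel record: pixel value plus one gain pair per track."""
    v: int
    gains: List[Tuple[float, float]]  # (left_gain, right_gain) for tracks 1..n


def crop_center(crop):
    """Return the integer center (x, y) of a crop given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = crop
    return (x0 + x1) // 2, (y0 + y1) // 2


def gains_for_crop(records, crop):
    """Return the gain pairs stored for the pixel at the center of the crop.

    records is indexed as records[y][x].
    """
    cx, cy = crop_center(crop)
    return records[cy][cx].gains


# Toy 4x2 image in which the stored gains simply encode the pixel position.
records = [[PixelRecord(v=0, gains=[(float(x), float(y))]) for x in range(4)]
           for y in range(2)]
print(gains_for_crop(records, (2, 0, 4, 2)))
```

The returned gain pairs would then be handed to the mixing stage in place of the gains used for the uncropped image.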
  • A multi-resolution image is defined to be an image that can be viewed at two or more different magnifications.
  • Such an image may be specified by a zoom setting.
  • To zoom into the image, i.e., increase the magnification, the user can point to a specific location in the image.
  • The display system selects the region centered at the new position at the next highest resolution level to fill the display area.
  • The display system crops the next highest resolution image at the boundaries of the display window.
  • A zoom operation may alter the effective position of the viewer with respect to the image both in terms of left-right alignment and distance.
  • Both the volume of the various audio tracks and the relative gains of the right and left channels must be adjusted to provide a realistic sound track when the image is zoomed.
  • The data needed to re-compute the left-right balance and amplitude for each audio source may be specified by specifying the gains for each of the left-right amplifiers at the various resolutions. That is, the attribute P(x, y, r) of the pixels in resolution layer r and position (x, y) is defined to include the channel amplifications to be used when the pixel at (x, y) becomes the center of the scene, i.e.,
    $P(x, y, r) = (v, R_1, L_1, R_2, L_2, \ldots, R_n, L_n)$
  • The multi-media work includes image data representing a moving image that includes a sequence of frames.
  • The above-described methods may be applied frame by frame by including the sound values for each pixel in each frame of the moving picture so that the audio tracks can be adjusted when that pixel becomes the center of the frame, i.e., for the pixel at (x, y) in frame f,
    $P(x, y, f) = (v, R_1, L_1, R_2, L_2, \ldots, R_n, L_n)$
  • where v is the image pixel value for the relevant pixel in the image, and
  • $R_1, L_1, R_2, L_2, \ldots, R_n, L_n$ are the right and left channel gains for audio sources 1 to n, respectively. Accordingly, the stereo orientation can be changed over time in response to a change in the viewing area.
  • Audio channel amplitudes for the various resolution layers may be stored for each frame to allow the relative volumes of the audio sources to be adjusted in time with changes in the visual viewing field specified by zooming in or out.
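One way to picture gains that depend on position, resolution layer, and frame at once is a table keyed by all four values. The sketch below is a hypothetical storage layout, not something the text prescribes: the class name `GainTable`, its `store` and `lookup` methods, and the sample gain values are invented for illustration.

```python
class GainTable:
    """Hypothetical store of gain pairs keyed by (x, y, r, f).

    x, y: pixel that becomes the center of the view
    r:    resolution layer (0 is assumed to be the widest view)
    f:    frame index in the moving image
    """

    def __init__(self):
        self._table = {}

    def store(self, x, y, r, f, gains):
        """Record one (left_gain, right_gain) pair per audio source."""
        self._table[(x, y, r, f)] = list(gains)

    def lookup(self, x, y, r, f):
        """Return the gain pairs for this view; raises KeyError if none stored."""
        return self._table[(x, y, r, f)]


table = GainTable()
# Wide view of frame 0: the single source is quiet and sits to the right.
table.store(40, 30, 0, 0, [(0.2, 0.6)])
# Zoomed view of the same frame centered on the source: louder and balanced.
table.store(40, 30, 1, 0, [(0.9, 0.9)])
print(table.lookup(40, 30, 1, 0))
```

Because the stored pair changes in both its ratio and its overall size between layers, a single lookup covers the left-right balance and the volume adjustment together.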
  • FIG. 3 is a block diagram of a sound and image processing system 150 according to another embodiment of the present invention.
  • Elements of system 150 that serve functions analogous to those of elements shown in Figure 2 have been given reference numerals that differ by 100 from those of the corresponding elements in Figure 2.
  • In system 150, the user again specifies a region of the image for cropping or zooming.
  • The information on the new scene is converted by filter controller 153 into a set of filter coefficients that are applied to the relevant sound tracks by digital filters.
  • Exemplary digital filters are shown at 163 and 164. Each digital filter coefficient changes in relation to the (x, y) coordinates of the center of the cropping region, the resolution layer, r, and the moving image frame and position.
  • The music tracks in a scene of a concert hall can be altered to include echoes that change as the scene is zoomed in or out, thereby producing a more realistic sound track.
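As a rough illustration of a filter whose coefficient follows the zoom level, the sketch below adds a single delayed copy of a track and weakens that echo as the view zooms in. Everything specific here is an assumption: the text does not give a filter structure or a mapping from resolution layer to coefficient, so the single-tap echo, the linear mapping in `echo_gain_for_zoom`, and the convention that layer 0 is the widest view are hypothetical.

```python
def echo_gain_for_zoom(r, levels, max_gain=0.5):
    """Hypothetical mapping from resolution layer r to an echo coefficient.

    Layer 0 (assumed widest view) gets the strongest echo; the most
    zoomed-in layer (levels - 1) gets none.
    """
    if levels < 2:
        return max_gain
    return max_gain * (1.0 - r / (levels - 1))


def apply_echo(samples, delay, gain):
    """Single-tap feed-forward echo: y[n] = x[n] + gain * x[n - delay]."""
    if delay <= 0:
        raise ValueError("delay must be a positive number of samples")
    out = list(samples)
    for n in range(delay, len(samples)):
        out[n] += gain * samples[n - delay]
    return out


impulse = [1.0, 0.0, 0.0, 0.0]
wide = apply_echo(impulse, delay=2, gain=echo_gain_for_zoom(0, 3))
close = apply_echo(impulse, delay=2, gain=echo_gain_for_zoom(2, 3))
print(wide, close)
```

A real filter controller would presumably derive several coefficients per track from the crop center, layer, and frame; the single coefficient here only shows the direction of the dependency.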
  • The processing can be customized for either binaural recording, in which the sound is played back through headphones, or transaural playback, in which the sound is played back through stereo speakers rather than headphones. In either case, the source of the sound is modified to correspond to the correct location in the modified display selected by the user.
  • The left and right channel gains of each audio track are given as the P(x, y) elements described above when the coordinate at the center of the cropping region is (x, y), for specific points such as those shown in Figure 4 at 201-204.
  • The left and right channel gains for a cropped frame having a center as shown at 205 may be obtained by interpolating the values stored for points 201-204.
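The interpolation step can be sketched with bilinear interpolation, which is one common choice; the text does not say which interpolation scheme is used, and it does not say that points 201-204 form an axis-aligned rectangle around point 205, so both are assumptions here. The function names and the `(left_gain, right_gain)` tuple layout are likewise hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def interpolate_gains(x0, y0, x1, y1, g00, g10, g01, g11, x, y):
    """Bilinearly interpolate a (left_gain, right_gain) pair at (x, y).

    g00, g10, g01, g11 are the pairs stored at the corners
    (x0, y0), (x1, y0), (x0, y1), (x1, y1) of an axis-aligned rectangle.
    """
    if x1 == x0 or y1 == y0:
        raise ValueError("corner points must span a non-degenerate rectangle")
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    result = []
    for channel in range(2):  # 0 = left gain, 1 = right gain
        top = lerp(g00[channel], g10[channel], tx)
        bottom = lerp(g01[channel], g11[channel], tx)
        result.append(lerp(top, bottom, ty))
    return tuple(result)


# The source is fully left at x = 0 and fully right at x = 10.
center = interpolate_gains(0, 0, 10, 10,
                           (1.0, 0.0), (0.0, 1.0),
                           (1.0, 0.0), (0.0, 1.0),
                           5, 5)
print(center)
```

Storing gains only at a few such points and interpolating between them avoids keeping a full gain record for every pixel.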
  • While the present invention has been described in terms of a display system, it will be obvious to those skilled in the art from the preceding discussion that the present invention may be practiced on any general purpose data processing system equipped to play back a multi-media work. In this case, the present invention can be implemented by altering the playback routines to provide the various user input functions and mixing functions described above with reference to the display system embodiments of the invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Television Signal Processing For Recording (AREA)
  • Stereophonic System (AREA)
  • Controls And Circuits For Display Device (AREA)
  • User Interface Of Digital Computer (AREA)
  • Digital Computer Display Output (AREA)
EP98939330A 1997-08-12 1998-08-11 Multi-media display system Expired - Lifetime EP1002266B1 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP21728197A JPH1175151A (ja) 1997-08-12 1997-08-12 Image display system with audio processing function
JP21728197 1997-08-12
PCT/US1998/016636 WO1999008180A1 (en) 1997-08-12 1998-08-11 Multi-media display system

Publications (2)

Publication Number Publication Date
EP1002266A1 (de) 2000-05-24
EP1002266B1 (de) 2006-12-27

Family

ID=16701687

Family Applications (1)

Application Number Title Priority Date Filing Date
EP98939330A Expired - Lifetime EP1002266B1 (de) 1997-08-12 1998-08-11 Multi-media display system

Country Status (6)

Country Link
EP (1) EP1002266B1 (de)
JP (1) JPH1175151A (de)
KR (1) KR20010022769A (de)
CN (1) CN1126026C (de)
DE (1) DE69836742T2 (de)
WO (1) WO1999008180A1 (de)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101211642B (zh) * 2006-12-30 2011-05-04 上海乐金广电电子有限公司 Method and device for playing audio files in an audio playback device

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4686402B2 (ja) * 2006-04-27 2011-05-25 オリンパスイメージング株式会社 Camera, playback device, and playback control method
JP2008154065A (ja) * 2006-12-19 2008-07-03 Roland Corp Effect imparting device
KR20110005205A (ko) * 2009-07-09 2011-01-17 삼성전자주식회사 Signal processing method and apparatus using the screen size of a display device
CN104036789B (zh) * 2014-01-03 2018-02-02 北京智谷睿拓技术服务有限公司 Multimedia processing method and multimedia device
JP2015142185A (ja) * 2014-01-27 2015-08-03 日本電信電話株式会社 Viewing method, viewing terminal, and viewing program
JP2017134713A (ja) * 2016-01-29 2017-08-03 セイコーエプソン株式会社 Electronic device and control program for electronic device
KR102332739B1 (ko) * 2016-05-30 2021-11-30 소니그룹주식회사 Sound processing apparatus and method, and program
CN111966278B (zh) * 2020-08-28 2022-03-25 网易(杭州)网络有限公司 Prompting method for a terminal device, terminal device, and storage medium
WO2023067715A1 (ja) * 2021-10-20 2023-04-27 日本電信電話株式会社 Information presentation system, device, method, and program

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5027689A (en) * 1988-09-02 1991-07-02 Yamaha Corporation Musical tone generating apparatus
GB8924334D0 (en) * 1989-10-28 1989-12-13 Hewlett Packard Co Audio system for a computer display
US5212733A (en) * 1990-02-28 1993-05-18 Voyager Sound, Inc. Sound mixing device
DE69322805T2 (de) * 1992-04-03 1999-08-26 Yamaha Corp. Method for controlling sound source position

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9908180A1 *

Also Published As

Publication number Publication date
DE69836742D1 (de) 2007-02-08
CN1126026C (zh) 2003-10-29
CN1266511A (zh) 2000-09-13
WO1999008180A1 (en) 1999-02-18
DE69836742T2 (de) 2007-04-26
KR20010022769A (ko) 2001-03-26
EP1002266B1 (de) 2006-12-27
JPH1175151A (ja) 1999-03-16

Similar Documents

Publication Publication Date Title
US6573909B1 (en) Multi-media display system
US5715318A (en) Audio signal processing
AU756265B2 (en) Apparatus and method for presenting sound and image
US7881479B2 (en) Audio processing method and sound field reproducing system
US5636283A (en) Processing audio signals
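KR101355414B1 (ko) Audio signal processing apparatus, audio signal processing method, and audio signal processing program
KR100854122B1 (ko) Virtual sound image localization processing apparatus, virtual sound image localization processing method, and recording medium
US6829017B2 (en) Specifying a point of origin of a sound for audio effects using displayed visual information from a motion picture
US7734362B2 (en) Calculating a doppler compensation value for a loudspeaker signal in a wavefield synthesis system
RU2735095C2 (ru) Audio processing device and method, and program
JPH0934392A (ja) Device for presenting images together with sound
CN108476367A (zh) Synthesis of signals for immersive audio playback
US5798922A (en) Method and apparatus for electronically embedding directional cues in two channels of sound for interactive applications
JP3315363B2 (ja) Moving image playback quality control device and control method therefor
EP1002266B1 (de) Multi-media display system
JP2003284196A (ja) Sound image localization signal processing device and sound image localization signal processing method
US5682433A (en) Audio signal processor for simulating the notional sound source
US20020037084A1 (en) Singnal processing device and recording medium
JP2004187288A (ja) Video and audio playback method for outputting sound from the display area of its sound-source image
JP2007116363A (ja) Acoustic space control device
JP7513020B2 (ja) Information processing device and method, playback device and method, and program
JPH08298635A (ja) Audio channel selection and synthesis method and device for implementing the method
Poirier-Quinot et al. RoomZ: Spatial panning plugin for dynamic RIR convolution auralisations
JPH1042398A (ja) Surround playback method and device
CN115103293B (zh) Target-oriented sound reproduction method and device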

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20000110

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE FR GB

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: HEWLETT-PACKARD COMPANY, A DELAWARE CORPORATION

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE FR GB

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REF Corresponds to:

Ref document number: 69836742

Country of ref document: DE

Date of ref document: 20070208

Kind code of ref document: P

ET Fr: translation filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20070928

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20090817

Year of fee payment: 12

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20110502

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20100831

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20110830

Year of fee payment: 14

Ref country code: GB

Payment date: 20110825

Year of fee payment: 14

REG Reference to a national code

Ref country code: DE

Ref legal event code: R082

Ref document number: 69836742

Country of ref document: DE

Representative's name: SCHOPPE, ZIMMERMANN, STOECKELER, ZINKLER & PAR, DE

REG Reference to a national code

Ref country code: DE

Ref legal event code: R082

Ref document number: 69836742

Country of ref document: DE

Representative's name: SCHOPPE, ZIMMERMANN, STOECKELER, ZINKLER & PAR, DE

Effective date: 20120229

Ref country code: DE

Ref legal event code: R081

Ref document number: 69836742

Country of ref document: DE

Owner name: HEWLETT-PACKARD DEVELOPMENT CO., L.P., US

Free format text: FORMER OWNER: HEWLETT-PACKARD CO. (N.D.GES.D.STAATES DELAWARE), PALO ALTO, US

Effective date: 20120229

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732E

Free format text: REGISTERED BETWEEN 20120329 AND 20120404

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20120811

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20130301

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20120811

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 69836742

Country of ref document: DE

Effective date: 20130301