WO2019027812A1 - Audio object classification based on location metadata - Google Patents

Audio object classification based on location metadata

Info

Publication number
WO2019027812A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
audio object
dialog
location metadata
objects
Prior art date
Application number
PCT/US2018/043980
Other languages
English (en)
Inventor
Mark William GERRARD
Original Assignee
Dolby Laboratories Licensing Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby Laboratories Licensing Corporation
Priority to CN201880049177.7A (patent CN110998724B)
Priority to US16/636,241 (patent US11386913B2)
Priority to EP18747091.9A (patent EP3662470B1)
Publication of WO2019027812A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals

Abstract

Methods (700, 800, 900), systems (200, 300, 400, 500, 600) and computer program products are provided. Location metadata (620) associated with an audio object is received (801). The location metadata defines a position of the audio object in an audio scene. It is estimated (630, 802), based on the location metadata, whether the audio object includes dialog. A value representative of a result of the estimation is assigned (803) to an object type parameter (231). In some example embodiments, audio objects are selected (661, 662, 804) based on values of their respective object type parameters. In some example embodiments, at least one of the selected audio objects is subjected to dialog enhancement (690, 807).
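The flow described in the abstract (receive location metadata, estimate whether the object includes dialog, assign the result to an object type parameter, select objects, enhance dialog) can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the application: the abstract does not say how the estimation is performed, so the rule used here (an object that stays near the front centre of the scene and barely moves is treated as dialog) is purely an assumption. The names `AudioObject`, `estimate_is_dialog`, `classify`, `select_dialog` and `enhance`, the normalized `[0, 1]` coordinates, the thresholds, and the plain gain boost standing in for dialog enhancement are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (x, y, z) position; coordinates are ASSUMED to be normalized to [0, 1],
# with y = 0 taken as the front of the audio scene.
Position = Tuple[float, float, float]


@dataclass
class AudioObject:
    name: str
    positions: List[Position]          # location metadata, one position per frame
    gain: float = 1.0
    object_type: Optional[str] = None  # stands in for the "object type parameter"


def estimate_is_dialog(positions: List[Position], center_x: float = 0.5,
                       front_y: float = 0.0, tol: float = 0.1,
                       max_step: float = 0.05) -> bool:
    """Hypothetical heuristic: dialog stays near front centre and is nearly static."""
    if not positions:
        return False
    near_front_center = all(
        abs(x - center_x) <= tol and abs(y - front_y) <= tol
        for x, y, _ in positions
    )
    nearly_static = all(
        max(abs(a - b) for a, b in zip(p, q)) <= max_step
        for p, q in zip(positions, positions[1:])
    )
    return near_front_center and nearly_static


def classify(obj: AudioObject) -> AudioObject:
    # Assign a value representing the estimation result to the type parameter.
    obj.object_type = "dialog" if estimate_is_dialog(obj.positions) else "other"
    return obj


def select_dialog(objects: List[AudioObject]) -> List[AudioObject]:
    # Select objects based on the values of their object type parameters.
    return [o for o in objects if o.object_type == "dialog"]


def enhance(obj: AudioObject, boost: float = 2.0) -> AudioObject:
    # Placeholder only: a real dialog enhancer does far more than scale gain.
    obj.gain *= boost
    return obj


scene = [
    AudioObject("voice", [(0.5, 0.0, 0.0), (0.52, 0.0, 0.0)]),
    AudioObject("flyover", [(0.0, 0.0, 1.0), (0.5, 0.5, 1.0), (1.0, 1.0, 1.0)]),
]
for obj in scene:
    classify(obj)
for obj in select_dialog(scene):
    enhance(obj)
```

In this sketch the classification reads only the position metadata and never inspects the audio samples, which is the point the abstract makes; any practical estimator would need thresholds tuned to the coordinate convention of the actual metadata format.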
PCT/US2018/043980 2017-08-01 2018-07-26 Audio object classification based on location metadata WO2019027812A1 (fr)

Priority Applications (3)

Application Number Publication Number Priority Date Filing Date Title
CN201880049177.7A CN110998724B (zh) 2017-08-01 2018-07-26 Audio object classification based on location metadata
US16/636,241 US11386913B2 (en) 2017-08-01 2018-07-26 Audio object classification based on location metadata
EP18747091.9A EP3662470B1 (fr) 2017-08-01 2018-07-26 Audio object classification based on location metadata

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201762539599P 2017-08-01 2017-08-01
US62/539,599 2017-08-01
EP17184244.6 2017-08-01
EP17184244 2017-08-01

Publications (1)

Publication Number Publication Date
WO2019027812A1 (fr) 2019-02-07

Family

ID=59506166

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/043980 WO2019027812A1 (fr) 2017-08-01 2018-07-26 Audio object classification based on location metadata

Country Status (1)

Country Link
WO (1) WO2019027812A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100121634A1 (en) 2007-02-26 2010-05-13 Dolby Laboratories Licensing Corporation Speech Enhancement in Entertainment Audio
US20150332680A1 (en) 2012-12-21 2015-11-19 Dolby Laboratories Licensing Corporation Object Clustering for Rendering Object-Based Audio Content Based on Perceptual Criteria
US20160078879A1 (en) 2013-03-26 2016-03-17 Dolby Laboratories Licensing Corporation Apparatuses and Methods for Audio Classifying and Processing
WO2016172111A1 (fr) * 2015-04-20 2016-10-27 Dolby Laboratories Licensing Corporation Processing audio data to compensate for partial hearing loss or an adverse hearing environment
US20170098452A1 (en) * 2015-10-02 2017-04-06 Dts, Inc. Method and system for audio processing of dialog, music, effect and height objects

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KUBA LOPATKA ET AL: "Improving listeners' experience for movie playback through enhancing dialogue clarity in soundtracks", DIGITAL SIGNAL PROCESSING, vol. 48, 1 January 2016 (2016-01-01), US, pages 40-49, XP055446566, ISSN: 1051-2004, DOI: 10.1016/j.dsp.2015.08.015 *

Similar Documents

Publication Publication Date Title
US11064310B2 (en) Method, apparatus or systems for processing audio objects
EP2936485B1 (fr) Object clustering for rendering object-based audio content based on perceptual criteria
US9282417B2 (en) Spatial sound reproduction
US10638246B2 (en) Audio object extraction with sub-band object probability estimation
AU2006233504B2 (en) Apparatus and method for generating multi-channel synthesizer control signal and apparatus and method for multi-channel synthesizing
CN104584121B (zh) Downmix compensation method, system and device for audio watermarking
US20150358753A1 (en) Method and apparatus for reproducing three-dimensional sound
JP2014515906A (ja) Method and system for upmixing audio to generate 3D audio
US11386913B2 (en) Audio object classification based on location metadata
EP3198594A1 (fr) Insertion of sound objects into a downmixed audio signal
WO2019027812A1 (fr) Audio object classification based on location metadata
JP2023514121A (ja) Spatial audio extension based on video information

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 18747091; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

ENP Entry into the national phase
Ref document number: 2018747091; Country of ref document: EP; Effective date: 20200302