WO2019027812A1 - Audio object classification based on location metadata
- Publication number
- WO2019027812A1 (PCT/US2018/043980)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio
- audio object
- dialog
- location metadata
- objects
Classifications
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
        - G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
          - G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
        - G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
        - G10L25/78—Detection of presence or absence of voice signals
      - G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
        - G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
Abstract
Methods (700, 800, 900), systems (200, 300, 400, 500, 600) and computer program products are provided. Location metadata (620) associated with an audio object is received (801). The location metadata defines a position of the audio object in an audio scene. It is estimated (630, 802), based on the location metadata, whether the audio object includes dialog. A value representative of a result of the estimation is assigned (803) to an object type parameter (231). In some example embodiments, audio objects are selected (661, 662, 804) based on the values of their respective object type parameters. In some example embodiments, at least one of the selected audio objects is subjected to dialog enhancement (690, 807).
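The abstract does not say how the estimation step is carried out. The following is a minimal Python sketch of the flow it describes (receive location metadata, estimate, assign an object type parameter, select, enhance), under loudly stated assumptions: the heuristic "objects near the front center of the scene are likely dialog", the normalized coordinate convention, the 0.25 radius, the 6 dB gain, and every name (`AudioObject`, `estimate_dialog`, `classify`, `select_dialog`, `enhance`) are hypothetical illustrations and are not taken from the patent.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Tuple


@dataclass
class AudioObject:
    name: str
    samples: List[float]
    # Location metadata: (x, y) in a normalized scene. Assumed convention:
    # x runs 0..1 from left to right, y runs 0..1 from front to back.
    location: Tuple[float, float]
    # The "object type parameter"; None until the object has been classified.
    object_type: Optional[str] = None


def estimate_dialog(location, center=(0.5, 0.0), radius=0.25):
    """Hypothetical heuristic: an object near front center is estimated to be dialog."""
    dx = location[0] - center[0]
    dy = location[1] - center[1]
    return hypot(dx, dy) <= radius


def classify(objects):
    """Assign a value representing the estimation result to each object type parameter."""
    for obj in objects:
        obj.object_type = "dialog" if estimate_dialog(obj.location) else "other"


def select_dialog(objects):
    """Select objects based on the value of their object type parameter."""
    return [obj for obj in objects if obj.object_type == "dialog"]


def enhance(obj, gain_db=6.0):
    """Stand-in for dialog enhancement: a plain gain boost (a simplifying assumption)."""
    gain = 10 ** (gain_db / 20)
    obj.samples = [s * gain for s in obj.samples]


scene = [
    AudioObject("voice", [0.1, -0.2], (0.5, 0.05)),
    AudioObject("helicopter", [0.3, 0.3], (0.9, 0.8)),
]
classify(scene)
selected = select_dialog(scene)
for obj in selected:
    enhance(obj)
```

In this sketch only the `voice` object falls inside the assumed front-center region, so only it is boosted; the `helicopter` object is left untouched. A real classifier would likely combine location with other cues, which the abstract leaves open.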
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201880049177.7A CN110998724B (zh) | 2017-08-01 | 2018-07-26 | Audio object classification based on location metadata |
US16/636,241 US11386913B2 (en) | 2017-08-01 | 2018-07-26 | Audio object classification based on location metadata |
EP18747091.9A EP3662470B1 (fr) | 2017-08-01 | 2018-07-26 | Audio object classification based on location metadata |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762539599P | 2017-08-01 | 2017-08-01 | |
US62/539,599 | 2017-08-01 | | |
EP17184244.6 | 2017-08-01 | | |
EP17184244 | 2017-08-01 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019027812A1 (fr) | 2019-02-07 |
Family
ID=59506166
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2018/043980 WO2019027812A1 (fr) | Audio object classification based on location metadata | 2017-08-01 | 2018-07-26 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2019027812A1 (fr) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100121634A1 (en) | 2007-02-26 | 2010-05-13 | Dolby Laboratories Licensing Corporation | Speech Enhancement in Entertainment Audio |
US20150332680A1 (en) | 2012-12-21 | 2015-11-19 | Dolby Laboratories Licensing Corporation | Object Clustering for Rendering Object-Based Audio Content Based on Perceptual Criteria |
US20160078879A1 (en) | 2013-03-26 | 2016-03-17 | Dolby Laboratories Licensing Corporation | Apparatuses and Methods for Audio Classifying and Processing |
WO2016172111A1 (fr) * | 2015-04-20 | 2016-10-27 | Dolby Laboratories Licensing Corporation | Processing audio data to compensate for partial hearing loss or an adverse hearing environment |
US20170098452A1 (en) * | 2015-10-02 | 2017-04-06 | Dts, Inc. | Method and system for audio processing of dialog, music, effect and height objects |
Non-Patent Citations (1)
Title |
---|
KUBA LOPATKA ET AL: "Improving listeners' experience for movie playback through enhancing dialogue clarity in soundtracks", DIGITAL SIGNAL PROCESSING, vol. 48, 1 January 2016 (2016-01-01), US, pages 40-49, XP055446566, ISSN: 1051-2004, DOI: 10.1016/j.dsp.2015.08.015 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11064310B2 (en) | | Method, apparatus or systems for processing audio objects |
EP2936485B1 (fr) | | Object clustering for rendering object-based audio content based on perceptual criteria |
US9282417B2 (en) | | Spatial sound reproduction |
US10638246B2 (en) | | Audio object extraction with sub-band object probability estimation |
AU2006233504B2 (en) | | Apparatus and method for generating multi-channel synthesizer control signal and apparatus and method for multi-channel synthesizing |
CN104584121B (zh) | | Downmix compensation method, system and apparatus for audio watermarking |
US20150358753A1 (en) | | Method and apparatus for reproducing three-dimensional sound |
JP2014515906A (ja) | | Method and system for upmixing audio to generate 3D audio |
US11386913B2 (en) | | Audio object classification based on location metadata |
EP3198594A1 (fr) | | Insertion of sound objects into a downmixed audio signal |
WO2019027812A1 (fr) | | Audio object classification based on location metadata |
JP2023514121A (ja) | | Spatial audio extension based on video information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 18747091; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2018747091; Country of ref document: EP; Effective date: 20200302 |