CN113168839B - Dual-ended media intelligence - Google Patents
Dual-ended media intelligence
- Publication number
- CN113168839B (application CN201980080866.9A)
- Authority
- CN
- China
- Prior art keywords
- audio content
- content
- classification information
- control weights
- virtualizer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/167—Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/26—Pre-filtering or post-filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/20—Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/65—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/002—Dynamic bit allocation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/22—Mode decision, i.e. based on audio signal content versus external parameters
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Signal Processing (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Mathematical Physics (AREA)
- Stereophonic System (AREA)
Applications Claiming Priority (7)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CNPCT/CN2018/120923 | 2018-12-13 | ||
| CN2018120923 | 2018-12-13 | ||
| US201962792997P | 2019-01-16 | 2019-01-16 | |
| US62/792,997 | 2019-01-16 | ||
| EP19157080.3 | 2019-02-14 | ||
| EP19157080 | 2019-02-14 | ||
| PCT/US2019/065338 WO2020123424A1 (en) | 2018-12-13 | 2019-12-10 | Dual-ended media intelligence |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN113168839A (zh) | 2021-07-23 |
| CN113168839B (zh) | 2024-01-23 |
Family
ID=69104844
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201980080866.9A (Active; published as CN113168839B, zh) | 2018-12-13 | 2019-12-10 | Dual-ended media intelligence |
Country Status (8)
| Country | Link |
|---|---|
| US (1) | US12469500B2 |
| EP (1) | EP3895164B1 |
| JP (2) | JP7455836B2 |
| KR (1) | KR20210102899A |
| CN (1) | CN113168839B |
| BR (1) | BR112021009667A2 |
| RU (1) | RU2768224C1 |
| WO (1) | WO2020123424A1 |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2023539121A (ja) * | 2020-08-18 | 2023-09-13 | Dolby Laboratories Licensing Corporation | Audio content identification |
| WO2022115303A1 (en) | 2020-11-27 | 2022-06-02 | Dolby Laboratories Licensing Corporation | Automatic generation and selection of target profiles for dynamic equalization of audio content |
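| CN115102931B (zh) * | 2022-05-20 | 2023-12-19 | Alibaba (China) Co., Ltd. | Method and electronic device for adaptively adjusting audio delay |
| CN116723438A (zh) * | 2023-05-26 | 2023-09-08 | Samsung Electronics (China) R&D Center | Correction parameter generation method and apparatus |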
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2010003521A1 (en) * | 2008-07-11 | 2010-01-14 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Method and discriminator for classifying different segments of a signal |
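| CN102687198A (zh) * | 2009-12-07 | 2012-09-19 | Dolby Laboratories Licensing Corporation | Decoding of multichannel audio encoded bit streams using adaptive hybrid transformation |
| JP2014132439A (ja) * | 2013-01-03 | 2014-07-17 | Mitsubishi Electric Corp | Method for encoding an input signal |
| CN104240709A (zh) * | 2013-06-19 | 2014-12-24 | Dolby Laboratories Licensing Corporation | Audio encoder and decoder with program information or substream structure metadata |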
Family Cites Families (32)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6360234B2 (en) | 1997-08-14 | 2002-03-19 | Virage, Inc. | Video cataloger system with synchronized encoders |
| US6833865B1 (en) | 1998-09-01 | 2004-12-21 | Virage, Inc. | Embedded metadata engines in digital capture devices |
| CN1284104C (zh) | 2001-05-15 | 2006-11-08 | Koninklijke Philips Electronics N.V. | Content analysis apparatus |
| US7454331B2 (en) * | 2002-08-30 | 2008-11-18 | Dolby Laboratories Licensing Corporation | Controlling loudness of speech in signals that contain speech and other types of audio material |
| US7895138B2 (en) * | 2004-11-23 | 2011-02-22 | Koninklijke Philips Electronics N.V. | Device and a method to process audio data, a computer program element and computer-readable medium |
| JP4713396B2 (ja) | 2006-05-09 | 2011-06-29 | Sharp Corporation | Video/audio reproduction device and sound image movement method thereof |
| US8121198B2 (en) | 2006-10-16 | 2012-02-21 | Microsoft Corporation | Embedding content-based searchable indexes in multimedia files |
| US7640272B2 (en) | 2006-12-07 | 2009-12-29 | Microsoft Corporation | Using automated content analysis for audio/video content consumption |
| CA2645915C (en) | 2007-02-14 | 2012-10-23 | Lg Electronics Inc. | Methods and apparatuses for encoding and decoding object-based audio signals |
| US20080208589A1 (en) * | 2007-02-27 | 2008-08-28 | Cross Charles W | Presenting Supplemental Content For Digital Media Using A Multimodal Application |
| US20100138890A1 (en) | 2007-05-07 | 2010-06-03 | Nxp B.V. | Device to allow content analysis in real time |
| EP2144230A1 (en) * | 2008-07-11 | 2010-01-13 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Low bitrate audio encoding/decoding scheme having cascaded switches |
| US8965545B2 (en) * | 2010-09-30 | 2015-02-24 | Google Inc. | Progressive encoding of audio |
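| TWI581250B (zh) * | 2010-12-03 | 2017-05-01 | Dolby Laboratories Licensing Corporation | Adaptive processing with multiple media processing nodes |
| KR102185941B1 (ko) * | 2011-07-01 | 2020-12-03 | Dolby Laboratories Licensing Corporation | System and method for adaptive audio signal generation, coding and rendering |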
| US20140056430A1 (en) * | 2012-08-21 | 2014-02-27 | Electronics And Telecommunications Research Institute | System and method for reproducing wave field using sound bar |
| US9805725B2 (en) | 2012-12-21 | 2017-10-31 | Dolby Laboratories Licensing Corporation | Object clustering for rendering object-based audio content based on perceptual criteria |
| CN112652316B (zh) | 2013-01-21 | 2023-09-15 | Dolby Laboratories Licensing Corporation | Audio encoder and decoder with loudness processing state metadata |
| ES2628153T3 (es) | 2013-01-28 | 2017-08-01 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Method and apparatus for normalized audio playback of media with and without embedded loudness metadata on new media devices |
| US9609452B2 (en) | 2013-02-08 | 2017-03-28 | Qualcomm Incorporated | Obtaining sparseness information for higher order ambisonic audio renderers |
| US8903186B2 (en) | 2013-02-28 | 2014-12-02 | Facebook, Inc. | Methods and systems for differentiating synthetic and non-synthetic images |
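| CN104080024B (zh) * | 2013-03-26 | 2019-02-19 | Dolby Laboratories Licensing Corporation | Volume leveler controller and controlling method, and audio classifier |
| CN104078050A (zh) * | 2013-03-26 | 2014-10-01 | Dolby Laboratories Licensing Corporation | Device and method for audio classification and audio processing |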
| US9559651B2 (en) * | 2013-03-29 | 2017-01-31 | Apple Inc. | Metadata for loudness and dynamic range control |
| US9418650B2 (en) | 2013-09-25 | 2016-08-16 | Verizon Patent And Licensing Inc. | Training speech recognition using captions |
| WO2016018787A1 (en) * | 2014-07-31 | 2016-02-04 | Dolby Laboratories Licensing Corporation | Audio processing systems and methods |
| US10110911B2 (en) | 2014-11-11 | 2018-10-23 | Cisco Technology, Inc. | Parallel media encoding |
| US10834436B2 (en) | 2015-05-27 | 2020-11-10 | Arris Enterprises Llc | Video classification using user behavior from a network digital video recorder |
| US9837086B2 (en) * | 2015-07-31 | 2017-12-05 | Apple Inc. | Encoded audio extended metadata-based dynamic range control |
| US9934790B2 (en) | 2015-07-31 | 2018-04-03 | Apple Inc. | Encoded audio metadata-based equalization |
| US9934785B1 (en) | 2016-11-30 | 2018-04-03 | Spotify Ab | Identification of taste attributes from an audio signal |
| JP7086521B2 (ja) | 2017-02-27 | 2022-06-20 | Yamaha Corporation | Information processing method and information processing apparatus |
2019
- 2019-12-10 KR: application KR1020217017682A, publication KR20210102899A (ko), status Withdrawn
- 2019-12-10 US: application US17/312,011, publication US12469500B2 (en), status Active
- 2019-12-10 BR: application BR112021009667-1A, publication BR112021009667A2 (pt), status Unknown
- 2019-12-10 WO: application PCT/US2019/065338, publication WO2020123424A1 (en), status Ceased
- 2019-12-10 CN: application CN201980080866.9A, publication CN113168839B (zh), status Active
- 2019-12-10 RU: application RU2021116055A, publication RU2768224C1 (ru), status Active
- 2019-12-10 JP: application JP2021532235A, publication JP7455836B2 (ja), status Active
- 2019-12-10 EP: application EP19831966.7A, publication EP3895164B1 (en), status Active
2024
- 2024-03-13 JP: application JP2024038518A, publication JP2024081674A (ja), status Pending
Also Published As
| Publication number | Publication date |
|---|---|
| WO2020123424A1 (en) | 2020-06-18 |
| RU2768224C1 (ru) | 2022-03-23 |
| JP7455836B2 (ja) | 2024-03-26 |
| US12469500B2 (en) | 2025-11-11 |
| EP3895164B1 (en) | 2022-09-07 |
| KR20210102899A (ko) | 2021-08-20 |
| EP3895164A1 (en) | 2021-10-20 |
| US20220059102A1 (en) | 2022-02-24 |
| BR112021009667A2 (pt) | 2021-08-17 |
| CN113168839A (zh) | 2021-07-23 |
| JP2022513184A (ja) | 2022-02-07 |
| JP2024081674A (ja) | 2024-06-18 |
Similar Documents
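| Publication | Publication Date | Title |
|---|---|---|
| CN113168839B (zh) | | Dual-ended media intelligence |
| KR102686742B1 (ko) | | Object-based audio signal balancing |
| CN110890101B (zh) | | Method and apparatus for decoding based on speech enhancement metadata |
| JP5001384B2 (ja) | | Audio signal processing method and apparatus |
| CN105814630B (zh) | | Concept for combined dynamic range compression and guided clipping prevention for audio devices |
| CN102768835B (zh) | | Apparatus and method for encoding and decoding multi-object audio signals with various channels |
| KR101049144B1 (ko) | | Audio signal processing method and apparatus |
| KR101100221B1 (ko) | | Method and apparatus for decoding an audio signal |
| US8634577B2 (en) | | Audio decoder |
| US7970144B1 (en) | | Extracting and modifying a panned source for enhancement and upmix of audio signals |
| IL266580A (en) | | Method and apparatus for adaptive control of decorrelation filters |
| US11463833B2 (en) | | Method and apparatus for voice or sound activity detection for spatial audio |
| CN112567765A (zh) | | Spatial audio capture, transmission and reproduction |
| EP3903309B1 (en) | | High resolution audio coding |
| KR20250145065A (ko) | | Method and system for improving speech dialogue intelligibility |
| CN113348507A (zh) | | High resolution audio coding |
| HK40126637A (zh) | | Method and device for discontinuous transmission in an object-based audio codec |
| JP2026503560A (ja) | | Method and device for flexible combined-format bit-rate adaptation in an audio codec |
| HK40097496A (zh) | | Method and device for audio bandwidth detection and audio bandwidth switching in an audio codec |
| HK1215489B (zh) | | Metadata for loudness and dynamic range control |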
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |