MX2022002587A - Audio metadata smoothing - Google Patents

Audio metadata smoothing

Info

Publication number
MX2022002587A
Authority
MX
Mexico
Prior art keywords
audio
metadata
segment
initial
frame
Prior art date
Application number
MX2022002587A
Other languages
English (en)
Inventor
Weiguo Zheng
Rex Ching
Weibo Ni
Kensuke Miyagi
Sean Munday
Teresa Tao
Original Assignee
Netflix Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Netflix Inc
Publication of MX2022002587A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/165 Management of the audio stream, e.g. setting of volume, audio stream path
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/23611 Insertion of stuffing data into a multiplex stream, e.g. to obtain a constant bitrate
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/167 Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0356 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for synchronising with other signals, e.g. video signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion
    • G10L21/055 Time compression or expansion for synchronising with other signals, e.g. video signals
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84 Generation or processing of descriptive data, e.g. content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456 Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Stereophonic System (AREA)
  • Diaphragms For Electromechanical Transducers (AREA)
  • Amplifiers (AREA)

Abstract

The present invention relates to a computer-implemented method for smoothing audio gaps using adaptive metadata. The method identifies an initial audio segment and a subsequent audio segment that follows the initial audio segment. The method accesses a first set of metadata that corresponds to the last audio frame of the initial audio segment and accesses a second set of metadata that corresponds to the first audio frame of the subsequent audio segment. The first and second sets of metadata include audio characteristic information for the two audio segments. The method then generates a new set of metadata that is based on both sets of audio characteristics. The method further inserts a new audio frame between the last audio frame of the initial audio segment and the first audio frame of the subsequent audio segment, and applies the new set of metadata to the new audio frame. Various other methods, systems, and computer-readable media are also described.
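The steps in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the abstract does not say which audio characteristics the metadata holds or how the two sets are combined, so the `FrameMetadata` fields (`loudness_db`, `gain_db`), the linear interpolation in `blend_metadata`, and the silent filler samples in `smooth_gap` are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FrameMetadata:
    """Hypothetical per-frame audio characteristics (field names are assumptions)."""
    loudness_db: float
    gain_db: float


@dataclass(frozen=True)
class AudioFrame:
    samples: Tuple[float, ...]  # PCM samples carried by this frame
    metadata: FrameMetadata


def blend_metadata(first: FrameMetadata, second: FrameMetadata,
                   weight: float = 0.5) -> FrameMetadata:
    """Build new metadata from both sets by linear interpolation (an assumed rule).

    weight=0.0 reproduces `first`; weight=1.0 reproduces `second`.
    """
    def lerp(a: float, b: float) -> float:
        return a + (b - a) * weight

    return FrameMetadata(
        loudness_db=lerp(first.loudness_db, second.loudness_db),
        gain_db=lerp(first.gain_db, second.gain_db),
    )


def smooth_gap(initial: List[AudioFrame], subsequent: List[AudioFrame],
               frame_size: int) -> List[AudioFrame]:
    """Insert one new frame between two segments and apply blended metadata to it."""
    if not initial or not subsequent:
        raise ValueError("both segments must contain at least one frame")

    last_meta = initial[-1].metadata      # first set: last frame of the initial segment
    first_meta = subsequent[0].metadata   # second set: first frame of the subsequent segment
    new_meta = blend_metadata(last_meta, first_meta)

    # Assumption: the inserted frame carries silence; the abstract does not specify its payload.
    new_frame = AudioFrame(samples=(0.0,) * frame_size, metadata=new_meta)
    return [*initial, new_frame, *subsequent]
```

For example, joining a segment that ends at -20 dB loudness with one that starts at -30 dB yields an inserted frame whose metadata sits halfway between the two, which is the kind of gradual transition the abstract describes.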
MX2022002587A 2019-09-23 2020-09-22 Audio metadata smoothing MX2022002587A (es)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201962904542P 2019-09-23 2019-09-23
US15/931,442 US11416208B2 (en) 2019-09-23 2020-05-13 Audio metadata smoothing
PCT/US2020/052017 WO2021061656A1 (en) 2019-09-23 2020-09-22 Audio metadata smoothing

Publications (1)

Publication Number Publication Date
MX2022002587A (es) 2022-03-22

Family

ID=74880856

Family Applications (1)

Application Number Title Priority Date Filing Date
MX2022002587A MX2022002587A (es) 2019-09-23 2020-09-22 Audio metadata smoothing

Country Status (7)

Country Link
US (1) US11416208B2 (es)
EP (1) EP4035402B1 (es)
AU (1) AU2020352977B2 (es)
BR (1) BR112022005474A2 (es)
CA (1) CA3147190A1 (es)
MX (1) MX2022002587A (es)
WO (1) WO2021061656A1 (es)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7375002B2 (ja) * 2019-05-14 2023-11-07 AlphaTheta株式会社 Audio device and music playback program
US11758206B1 (en) * 2021-03-12 2023-09-12 Amazon Technologies, Inc. Encoding media content for playback compatibility

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008001254A1 (en) * 2006-06-26 2008-01-03 Nxp B.V. Method and device for data packing
US9183885B2 (en) * 2008-05-30 2015-11-10 Echostar Technologies L.L.C. User-initiated control of an audio/video stream to skip interstitial content between program segments
US8326127B2 (en) * 2009-01-30 2012-12-04 Echostar Technologies L.L.C. Methods and apparatus for identifying portions of a video stream based on characteristics of the video stream
US8422699B2 (en) * 2009-04-17 2013-04-16 Linear Acoustic, Inc. Loudness consistency at program boundaries
WO2011020065A1 (en) * 2009-08-14 2011-02-17 Srs Labs, Inc. Object-oriented audio streaming system
US8428936B2 (en) * 2010-03-05 2013-04-23 Motorola Mobility Llc Decoder for audio signal including generic audio and speech frames
WO2012122397A1 (en) * 2011-03-09 2012-09-13 Srs Labs, Inc. System for dynamically creating and rendering audio objects
US8924580B2 (en) * 2011-08-12 2014-12-30 Cisco Technology, Inc. Constant-quality rate-adaptive streaming
US8752085B1 (en) * 2012-02-14 2014-06-10 Verizon Patent And Licensing Inc. Advertisement insertion into media content for streaming
US8813120B1 (en) * 2013-03-15 2014-08-19 Google Inc. Interstitial audio control
US20140275851A1 (en) * 2013-03-15 2014-09-18 eagleyemed, Inc. Multi-site data sharing platform
US20150199968A1 (en) * 2014-01-16 2015-07-16 CloudCar Inc. Audio stream manipulation for an in-vehicle infotainment system
KR20240032178A (ko) * 2014-09-12 2024-03-08 소니그룹주식회사 Transmission device, transmission method, reception device, and reception method
JP6728154B2 (ja) * 2014-10-24 2020-07-22 ドルビー・インターナショナル・アーベー Encoding and decoding of audio signals
CA2978835C (en) * 2015-03-09 2021-01-19 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Fragment-aligned audio coding
GB2581032B (en) * 2015-06-22 2020-11-04 Time Machine Capital Ltd System and method for onset detection in a digital signal
US10341770B2 (en) * 2015-09-30 2019-07-02 Apple Inc. Encoded audio metadata-based loudness equalization and dynamic equalization during DRC
CN108780653B (zh) * 2015-10-27 2020-12-04 扎克·J·沙隆 System and method for audio content production, audio sequencing, and audio mixing
EP3185570A1 (en) * 2015-12-22 2017-06-28 Thomson Licensing Method and apparatus for transmission-based smoothing of rendering
US9880803B2 (en) * 2016-04-06 2018-01-30 International Business Machines Corporation Audio buffering continuity
US11183147B2 (en) * 2016-10-07 2021-11-23 Sony Semiconductor Solutions Corporation Device and method for processing video content for display control
GB2557970B (en) * 2016-12-20 2020-12-09 Mashtraxx Ltd Content tracking system and method

Also Published As

Publication number Publication date
WO2021061656A1 (en) 2021-04-01
EP4035402B1 (en) 2024-05-01
US20210089259A1 (en) 2021-03-25
AU2020352977B2 (en) 2023-06-01
US11416208B2 (en) 2022-08-16
EP4035402A1 (en) 2022-08-03
BR112022005474A2 (pt) 2022-06-14
AU2020352977A1 (en) 2022-02-24
CA3147190A1 (en) 2021-04-01

Similar Documents

Publication Publication Date Title
US11935548B2 (en) Multi-channel signal encoding method and encoder
PH12019500771A1 (en) Business processing method and apparatus
EP3743831A4 (en) PERTINENCE CALCULATION PROCESS, APPARATUS FOR CALCULATING RELEVANCE, DATA INTERROGATION APPARATUS AND COMPUTER-READABLE NON-TRANSITIONAL INFORMATION MEDIA
WO2020035085A3 (en) System and method for determining voice characteristics
MX2022002587A (es) Audio metadata smoothing
WO2019143737A8 (en) Systems and methods for modeling probability distributions
EP3736806A4 (en) AUDIO SYNTHETIZATION PROCESS, STORAGE MEDIUM AND COMPUTER EQUIPMENT
EP3828885A4 (en) VOICE DISRUPTION PROCESS AND APPARATUS, COMPUTER DEVICE AND COMPUTER READABLE STORAGE MEDIA
SG10201707702YA (en) Collaborative Voice Controlled Devices
EP3791387A4 (en) SYSTEMS AND METHODS FOR ENHANCED SPEECH RECOGNITION USING NEUROMUSCULAR INFORMATION
WO2019228563A3 (en) System and method for digital asset management
CA2902821C (en) System for metadata management
GB2613507A (en) Dual-modality relation networks for audio-visual event localization
EP4219071A3 (en) Methods and apparatus to compensate impression data for misattribution and/or non-coverage by a database proprietor
MX340027B (es) Presentation of actions and providers associated with entities
EP4333461A3 (en) Improved rendering of immersive audio content
GB2587942A (en) Layered stochastic anonymization of data
MX2022005322A (es) Page simulation system
EP3774987A4 (en) POLYPROPIOLACTONE FILMS AND METHOD FOR MANUFACTURING THEREOF
EP4273817A3 (en) Method for occupying device and electronic device
EP3944231A4 (en) VOICE RECOGNITION DEVICES AND METHOD OF WAKE-UP RESPONSE THEREOF, AND COMPUTER STORAGE MEDIA
EP3759629B8 (en) Method, entity and system for managing access to data through a late dynamic binding of its associated metadata
MX2019008957A (es) Signaling of QoS flow control parameters
MX2019015132A (es) Access control method and related product
EP3956850A4 (en) SYSTEMS AND METHODS FOR DEDUCTING PLATFORM FEEDBACK DATA FOR DYNAMIC RETRIEVAL BY DOWNSTREAM SUBSYSTEMS