CN114051194A - Audio track metadata and generation method, electronic equipment and storage medium - Google Patents
Info
- Publication number
- CN114051194A (application CN202111204386.4A)
- Authority
- CN
- China
- Prior art keywords
- audio
- audio track
- format
- metadata
- track
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/683—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/686—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/04—Circuit arrangements, e.g. for selective connection of amplifier inputs/outputs to loudspeakers, for loudspeaker detection, or for adaptation of settings to personal preferences or hearing impairments
Abstract
The present disclosure relates to audio track metadata, a generation method therefor, an electronic device, and a storage medium. The audio track metadata includes an attribute area containing an audio track name, an audio track identifier, and audio track format description information, and a sub-element area containing audio stream format reference information. With this metadata, audio data can reproduce three-dimensional sound in space when rendered, improving the quality of the sound scene.
Description
Technical Field
The present disclosure relates to the field of audio processing technologies, and in particular to audio track metadata, a generation method therefor, an electronic device, and a storage medium.
Background
With the development of technology, audio has become more and more complex. Early single-channel audio gave way to stereo, and the focus of the work shifted to handling the left and right channels correctly. The process became more complex with the arrival of surround sound: the surround 5.1 speaker system imposes an ordering constraint on multiple channels, and the surround 6.1 and 7.1 speaker systems and the like diversify audio processing further, with the correct signals having to be delivered to the appropriate speakers so that they combine into a coherent effect. As sound becomes more immersive and interactive, the complexity of audio processing therefore also increases greatly.
An audio channel refers to an audio signal that is captured or played back at a particular spatial location and is independent of the other channels when sound is recorded or played. The number of channels is the number of sound sources during recording or the number of corresponding speakers during playback. For example, a surround 5.1 speaker system comprises audio signals at 6 different spatial locations, and each separate audio signal is used to drive a speaker at the corresponding spatial location; a surround 7.1 speaker system comprises audio signals at 8 different spatial positions, each driving a speaker at the corresponding spatial position.
Therefore, the effect achievable by current loudspeaker systems depends on the number and spatial positions of the loudspeakers. For example, a two-channel speaker system cannot achieve the effect of a surround 5.1 speaker system.
To provide metadata capable of solving the above technical problems, the present disclosure provides audio track metadata and a method for generating it.
Disclosure of Invention
The present disclosure is directed to audio track metadata, a generation method therefor, an electronic device, and a storage medium, so as to solve at least one of the above technical problems.
To achieve the above object, a first aspect of the present disclosure provides audio track metadata, including:
the attribute area comprises an audio track name, an audio track identifier and audio track format description information;
and a sub-element area including audio stream format reference information.
To achieve the above object, a second aspect of the present disclosure provides a method for generating audio track metadata, including:
generating the audio track metadata as described in the first aspect.
To achieve the above object, a third aspect of the present disclosure provides an electronic device, including: a memory and one or more processors;
the memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to generate the audio track metadata as described in the first aspect.
To achieve the above object, a fourth aspect of the present disclosure provides a storage medium containing computer-executable instructions which, when executed by a computer processor, generate audio track metadata as described in the first aspect.
As can be seen from the above, the disclosed audio track metadata describes the format of the audio data and allows the renderer to correctly decode the signal, so that three-dimensional sound can be reproduced in space and the quality of the sound scene is improved.
Drawings
Fig. 1 is a schematic diagram of a three-dimensional acoustic audio production model provided in embodiment 1 of the present disclosure;
fig. 2 is a flowchart of a method for generating audio track metadata provided in embodiment 2 of the present disclosure;
fig. 3 is a schematic structural diagram of an electronic device provided in embodiment 3 of the present disclosure.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
As shown in fig. 1, a three-dimensional audio production model is composed of a set of production elements each describing one stage of audio production, and includes a content production section and a format production section.
Wherein the content production section includes: an audio program element, an audio content element, an audio object element, and an audio track unique identification element; the format production section includes: an audio packet format element, an audio channel format element, an audio stream format element, and an audio track format element;
the audio program element references at least one audio content element; the audio content element references at least one audio object element; the audio object element references the corresponding audio packet format element and the corresponding audio track unique identification element; and the audio track unique identification element references the corresponding audio track format element and the corresponding audio packet format element;
the audio packet format element references at least one audio channel format element; the audio stream format element references the corresponding audio channel format element and the corresponding audio packet format element; and the audio track format element and the corresponding audio stream format element reference each other. The reference relationships between elements are indicated by arrows in fig. 1.
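To make these reference relationships concrete, the following is a minimal sketch of the production-model elements as plain data structures; the class and field names are illustrative assumptions rather than identifiers defined by the patent or by any particular library, and references are kept simply as lists of element IDs.

```python
# Sketch of the reference relationships in the production model of Fig. 1.
# Names are illustrative assumptions, not the patent's identifiers.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class AudioProgramme:                 # content production: references audio contents
    programme_id: str
    content_refs: List[str] = field(default_factory=list)

@dataclass
class AudioContent:                   # references audio objects
    content_id: str
    object_refs: List[str] = field(default_factory=list)

@dataclass
class AudioObject:                    # references packet formats and track unique IDs
    object_id: str
    pack_format_refs: List[str] = field(default_factory=list)
    track_uid_refs: List[str] = field(default_factory=list)

@dataclass
class AudioTrackUID:                  # references a track format and a packet format
    track_uid: str
    track_format_ref: Optional[str] = None
    pack_format_ref: Optional[str] = None

@dataclass
class AudioPackFormat:                # format production: references channel formats
    pack_format_id: str
    channel_format_refs: List[str] = field(default_factory=list)

@dataclass
class AudioStreamFormat:              # references a channel or packet format, and track formats
    stream_format_id: str
    channel_format_ref: Optional[str] = None
    pack_format_ref: Optional[str] = None
    track_format_refs: List[str] = field(default_factory=list)

@dataclass
class AudioTrackFormat:               # mutual reference with its stream format
    track_format_id: str
    stream_format_ref: Optional[str] = None
```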
An audio program may include, but is not limited to, narration, sound effects, and background music. An audio program element describes a program, a program contains at least one piece of content, and each piece of content is described by an audio content element. An audio program element may reference one or more audio content elements, which are grouped together to construct a complete audio program.
The audio content elements describe the content of a component of an audio program, such as background music, and relate the content to its format by reference to one or more audio object elements.
The audio object elements are used to tie together the content, the format, and other useful information, and to determine the audio track unique identifications of the actual audio tracks.
The audio packet format element may be configured to describe a format adopted when the audio object element and the original audio data are packed according to channel packets.
An audio stream is a combination of audio tracks needed to render a channel, an object, a higher-order ambient sound (Ambisonics) component, or a packet. The audio stream format element establishes a relationship between a set of audio track formats and a set of audio channel formats or audio packet formats.
The audio channel format element may be used to represent a single sequence of audio samples and preset operations performed on it, such as movement of rendering objects in a scene. The audio channel format element may comprise at least one audio block format element. The audio block format elements may be considered to be sub-elements of the audio channel format elements, and therefore there is an inclusion relationship between the audio channel format elements and the audio block format elements.
After the original audio data has been produced through the three-dimensional acoustic audio production model, synthesized audio data containing metadata is generated.
Metadata is information that describes the characteristics of data; the functions it supports include indicating storage locations, recording historical data, resource lookup, and file records.
After the synthesized audio data is transmitted to the remote end over a communication link, the remote end renders it based on the metadata to restore the original sound scene.
The division between content production, format production, and the BW64 (Broadcast Wave 64-bit) file is shown in fig. 1. The content production portion and the format production portion together constitute metadata in XML format, which is typically contained in one block (the "axml" block) of the BW64 file. The BW64 file portion at the bottom also contains a channel allocation ("chna") block, a look-up table used to link the metadata to the audio tracks in the file.
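As an illustration of this file layout, the sketch below walks the chunks of a RIFF/BW64-style file and returns the raw bytes of the "axml" block. It assumes the ordinary 32-bit chunk headers of a WAV/BW64 file and deliberately ignores the ds64 mechanism for files larger than 4 GB, so it is a reading aid rather than a complete BW64 parser.

```python
# Sketch of locating the "axml" metadata block in a RIFF/BW64-style file.
# Assumes 4-byte chunk IDs followed by 4-byte little-endian sizes.
import struct
from typing import Optional

def read_axml_chunk(path: str) -> Optional[bytes]:
    with open(path, "rb") as f:
        f.read(4)                      # b"RIFF" or b"BW64"
        f.read(4)                      # overall file size (unused in this sketch)
        f.read(4)                      # b"WAVE"
        axml = None
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            chunk_id, size = struct.unpack("<4sI", header)
            data = f.read(size)
            if size % 2:               # chunks are padded to even byte boundaries
                f.read(1)
            print(chunk_id.decode("ascii", "replace"), size)   # e.g. fmt, chna, axml, data
            if chunk_id == b"axml":
                axml = data            # the XML metadata produced by the model above
        return axml
```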
Example 1
The present disclosure provides and describes in detail audio track metadata in a three-dimensional acoustic audio model.
An audio track format element corresponds to a set of samples or data in a single audio track in a storage medium. It describes the format of the data, allowing the renderer to decode the signal correctly. It is derived from an audio stream format element, which identifies the combination of audio tracks needed to successfully decode the audio track data.
The audio track metadata includes:
the attribute area comprises an audio track name, an audio track identifier and audio track format description information;
and a sub-element area including audio stream format reference information.
The attribute area contains the general definition of the audio track metadata. The audio track name is a name set for the audio track, by which a user can identify the track. The audio track identifier is the identification symbol of the audio track. The audio track format description information may include a format tag and/or a format definition, which indicate the type of the track, i.e. the encoding format of the audio it describes. The format definition specifies the coding format of the audio and may be either a PCM audio coding format or a non-PCM audio coding format; Pulse Code Modulation (PCM) is an encoding format for audio data. The format tag is a numerical code, and each stream type has a corresponding numerical code; for example, a PCM type stream is denoted by 0001.
For PCM audio, the audio stream format will refer to a single audio track format, so the two elements, audio track format and audio stream format, effectively describe the same thing. For non-PCM audio, multiple audio track formats must be combined in one audio stream format to generate decodable data.
Software that parses the model may start either from an audio track format or from an audio stream format. To allow this flexibility, the audio track format may also reference an audio stream format. It is strictly required, however, that if an audio track format uses this reference, the referenced audio stream format must reference that audio track format back. The audio stream format reference information may include an audio stream identifier indicating the audio stream format to which the audio track format refers.
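A minimal sketch of this mutual-reference rule is shown below; the dictionary layout and function name are assumptions made for illustration, not an API defined by the patent.

```python
# Check that every track-format -> stream-format reference is matched by a
# back-reference from the stream format to that track format.
from typing import Dict, List

def check_mutual_references(
    track_formats: Dict[str, str],         # audioTrackFormatID -> referenced audioStreamFormatID
    stream_formats: Dict[str, List[str]],  # audioStreamFormatID -> referenced audioTrackFormatIDs
) -> List[str]:
    errors = []
    for track_id, stream_id in track_formats.items():
        back_refs = stream_formats.get(stream_id, [])
        if track_id not in back_refs:
            errors.append(
                f"{track_id} references {stream_id}, "
                f"but {stream_id} does not reference it back"
            )
    return errors

# The PCM case: one stream format referencing a single track format.
print(check_mutual_references(
    {"AT_00010001_01": "AS_00010001"},
    {"AS_00010001": ["AT_00010001_01"]},
))  # -> []
```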
The audio track identifier may include: an audio stream identifier indicating the audio stream to which the audio track belongs, and a number indicating the position of the audio track within that stream. For non-PCM audio, multiple audio track formats must be combined in one audio stream format to generate decodable data, so one audio stream contains several audio tracks and the audio track identifier needs to carry the number of the track within the corresponding stream. For PCM audio, the audio stream format refers to a single audio track format, so the track number within the stream is unique in the audio track identifier of that track format and may be, for example, 01.
Alternatively, the audio track identifier may comprise a group of 8 hexadecimal digits and a group of 2 hexadecimal digits. The first four of the 8 hexadecimal digits represent the type of audio contained in the audio track, and the last four represent the corresponding audio stream format. For example, in an audio track identifier of the form AT_yyyyxxxx_nn, yyyy denotes the type of audio contained in the track, xxxx matches the number in the audio stream identifier of the audio stream format, and nn denotes the number of the audio track in the stream (which may start at 01). The attribute area includes the information shown in Table 1.
TABLE 1

Attribute | Description | Required
---|---|---
audioTrackFormatName | Name of the audio track | Yes
audioTrackFormatID | Identifier of the audio track | Yes
formatLabel | Numerical code of the format type (format tag) | Optional
formatDefinition | Description of the format type (format definition) | Optional
In Table 1, the "Required" column indicates whether the attribute must be set when generating the audio track metadata: "Yes" means the attribute is mandatory, and "Optional" means it may be omitted, with the constraint that at least one of the format definition and the format tag must be set.
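As an illustration of the AT_yyyyxxxx_nn structure described above, the following sketch splits an audio track identifier into its three parts; the function name and the regular expression are assumptions made for the example.

```python
# Parse an audio track identifier of the form AT_yyyyxxxx_nn (hexadecimal digits).
import re

_TRACK_ID = re.compile(r"^AT_([0-9a-fA-F]{4})([0-9a-fA-F]{4})_([0-9a-fA-F]{2})$")

def parse_track_format_id(track_id: str) -> dict:
    """Split AT_yyyyxxxx_nn into its type, stream and in-stream track parts."""
    m = _TRACK_ID.match(track_id)
    if m is None:
        raise ValueError(f"not a valid audio track identifier: {track_id!r}")
    type_code, stream_number, track_number = m.groups()
    return {
        "audio_type": type_code,          # yyyy: type of audio contained in the track
        "stream_number": stream_number,   # xxxx: matches the audio stream format number
        "track_in_stream": track_number,  # nn: number of the track within the stream
    }

print(parse_track_format_id("AT_00010001_01"))
# {'audio_type': '0001', 'stream_number': '0001', 'track_in_stream': '01'}
```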
The sub-element area includes the information shown in Table 2.
TABLE 2

Sub-element | Description | Quantity
---|---|---
audioStreamFormatIDRef | Reference to the identifier of the audio stream format | 1
The "Quantity" entry in Table 2 indicates how many instances of the sub-element can be set; since the audio track format can reference one audio stream format, the quantity of audioStreamFormatIDRef is 1.
Example 2
The present disclosure also provides a method embodiment matching the above embodiment, namely a method for generating the audio track metadata. Terms with the same names have the same meanings and the same technical effects as in the above embodiment, so the details are not repeated here.
A method for generating audio track metadata, as shown in fig. 2, comprising the steps of:
step S110, in response to a setting operation of a user for audio track metadata, generating audio track metadata, where the audio track metadata includes:
the attribute area comprises an audio track name, an audio track identifier and audio track format description information;
and a sub-element area including audio stream format reference information.
The setting operation of the user for the audio track metadata may be an operation in which the user sets the relevant attributes of the audio track metadata, for example by entering the attributes item by item. Alternatively, the audio track metadata may be generated automatically in response to the user running a preset metadata generation program, which may be configured either to set all attributes of the audio track metadata to system defaults, or to set some attributes to system defaults and then receive the remaining attributes from the user.
Optionally, the audio track identification includes: an audio stream identification indicating an audio stream to which an audio track belongs and a number indicating the audio track in the audio stream.
Optionally, the audio track format description information includes a format tag and/or a format definition.
Optionally, the format definition includes a PCM audio encoding format and a non-PCM audio encoding format.
Optionally, the audio stream format reference information includes an audio stream identifier.
For example, the audio track metadata may be encoded as follows:
<audioTrackFormat audioTrackFormatID="AT_00010001_01"
    audioTrackFormatName="PCM_FrontLeft"
    formatDefinition="PCM" formatLabel="0001">
  <audioStreamFormatIDRef>AS_00010001</audioStreamFormatIDRef>
</audioTrackFormat>
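A minimal sketch of how such an element could be produced programmatically is shown below; the helper name and the choice of the standard xml.etree.ElementTree module are assumptions for illustration. In the default-attribute variants described above, the caller would simply fill the arguments from system defaults before building the element.

```python
# Build the audioTrackFormat element shown above with the Python standard library.
import xml.etree.ElementTree as ET

def build_audio_track_format(track_id: str, name: str, format_definition: str,
                             format_label: str, stream_format_id: str) -> ET.Element:
    track = ET.Element("audioTrackFormat", {
        "audioTrackFormatID": track_id,
        "audioTrackFormatName": name,
        "formatDefinition": format_definition,
        "formatLabel": format_label,
    })
    ref = ET.SubElement(track, "audioStreamFormatIDRef")
    ref.text = stream_format_id           # back-reference to the audio stream format
    return track

elem = build_audio_track_format("AT_00010001_01", "PCM_FrontLeft",
                                "PCM", "0001", "AS_00010001")
print(ET.tostring(elem, encoding="unicode"))
# prints the audioTrackFormat element above on a single line
```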
The audio track metadata generated by this method describes the format of the data and allows a renderer to decode the signal correctly, enabling the reproduction of three-dimensional sound in space and thereby improving the quality of the sound scene.
Example 3
Fig. 3 is a schematic structural diagram of an electronic device provided in embodiment 3 of the present disclosure. As shown in fig. 3, the electronic device includes: a processor 30, a memory 31, an input device 32, and an output device 33. There may be one or more processors 30 in the electronic device; one processor 30 is taken as an example in fig. 3. There may likewise be one or more memories 31; one memory 31 is taken as an example in fig. 3. The processor 30, the memory 31, the input device 32, and the output device 33 may be connected by a bus or in another manner; fig. 3 illustrates connection by a bus as an example. The electronic device may be a computer, a server, or the like. This embodiment is described in detail by taking a server as the electronic device; the server may be an independent server or a cluster server.
The input device 32 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic device, and may also include a camera for capturing images and a sound pickup device for capturing audio data. The output device 33 may include an audio device such as a speaker. The specific composition of the input device 32 and the output device 33 may be set according to actual conditions.
The processor 30 executes various functional applications of the device and data processing, i.e. generating audio track metadata, by running software programs, instructions and modules stored in the memory 31.
Example 4
Embodiment 4 of the present disclosure also provides a storage medium containing computer-executable instructions which, when executed by a computer processor, generate audio track metadata as described in embodiment 1.
Of course, the computer-executable instructions contained in the storage medium provided by the embodiments of the present disclosure are not limited to the operations described above, and may also perform related operations in the method provided by any embodiment of the present disclosure, with the corresponding functions and advantages.
From the above description of the embodiments, it is obvious for a person skilled in the art that the present disclosure can be implemented by software and necessary general hardware, and certainly can be implemented by hardware, but in many cases, the former is a better embodiment. Based on such understanding, the technical solutions of the present disclosure may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a robot, a personal computer, a server, or a network device) to execute the electronic method according to any embodiment of the present disclosure.
It should be noted that, in the electronic device, the units and modules included in the electronic device are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be realized; in addition, specific names of the functional units are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the present disclosure.
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "in an embodiment," "in yet another embodiment," "exemplary" or "in a particular embodiment," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the disclosure. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although the present disclosure has been described in detail hereinabove with respect to general description, specific embodiments and experiments, it will be apparent to those skilled in the art that some modifications or improvements may be made based on the present disclosure. Accordingly, such modifications and improvements are intended to be within the scope of this disclosure, as claimed.
Claims (8)
1. Audio track metadata, comprising:
the attribute area comprises an audio track name, an audio track identifier and audio track format description information;
and a sub-element area including audio stream format reference information.
2. The audio track metadata according to claim 1, wherein the audio track identification comprises: an audio stream identification indicating an audio stream to which an audio track belongs and a number indicating the audio track in the audio stream.
3. Audio track metadata according to claim 1, characterized in that said audio track format description information comprises format tags and/or format definitions.
4. The audio track metadata according to claim 3, wherein the format definition comprises a PCM audio encoding format and a non-PCM audio encoding format.
5. The audio track metadata according to claim 1, wherein said audio stream format reference information comprises an audio stream identification.
6. A method of generating audio track metadata, arranged to generate the audio track metadata as claimed in any one of claims 1 to 5.
7. An electronic device, comprising: a memory and one or more processors;
the memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to generate the audio track metadata as claimed in any one of claims 1 to 5.
8. A storage medium containing computer-executable instructions which, when executed by a computer processor, generate audio track metadata according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111204386.4A CN114051194A (en) | 2021-10-15 | 2021-10-15 | Audio track metadata and generation method, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111204386.4A CN114051194A (en) | 2021-10-15 | 2021-10-15 | Audio track metadata and generation method, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114051194A true CN114051194A (en) | 2022-02-15 |
Family
ID=80205219
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111204386.4A Pending CN114051194A (en) | 2021-10-15 | 2021-10-15 | Audio track metadata and generation method, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114051194A (en) |
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101111894A (en) * | 2005-01-25 | 2008-01-23 | 尼禄股份公司 | Method for preparing dvd-video formatted data, method for reconstructing dvd-video data and dvd-video data structure |
CN101777370A (en) * | 2004-07-02 | 2010-07-14 | 苹果公司 | Universal container for audio data |
CN101802823A (en) * | 2007-08-20 | 2010-08-11 | 诺基亚公司 | Segmented metadata and indexes for streamed multimedia data |
CN102246491A (en) * | 2008-10-08 | 2011-11-16 | 诺基亚公司 | System and method for storing multi-source multimedia presentations |
US20110282650A1 (en) * | 2010-05-17 | 2011-11-17 | Avaya Inc. | Automatic normalization of spoken syllable duration |
WO2013006342A1 (en) * | 2011-07-01 | 2013-01-10 | Dolby Laboratories Licensing Corporation | Synchronization and switchover methods and systems for an adaptive audio system |
WO2013182901A1 (en) * | 2012-06-07 | 2013-12-12 | Actiwave Ab | Non-linear control of loudspeakers |
US20140123006A1 (en) * | 2012-10-25 | 2014-05-01 | Apple Inc. | User interface for streaming media stations with flexible station creation |
WO2018040102A1 (en) * | 2016-09-05 | 2018-03-08 | 华为技术有限公司 | Audio processing method and device |
CN109273014A (en) * | 2015-03-13 | 2019-01-25 | 杜比国际公司 | Decode the audio bit stream with the frequency spectrum tape copy metadata of enhancing |
US20190265943A1 (en) * | 2018-02-23 | 2019-08-29 | Bose Corporation | Content based dynamic audio settings |
CN110600043A (en) * | 2013-06-19 | 2019-12-20 | 杜比实验室特许公司 | Audio processing unit, method executed by audio processing unit, and storage medium |
CN111542806A (en) * | 2017-10-12 | 2020-08-14 | 弗劳恩霍夫应用研究促进协会 | Method and apparatus for efficient delivery and use of high quality of experience audio messages |
US20210050028A1 (en) * | 2018-01-26 | 2021-02-18 | Lg Electronics Inc. | Method for transmitting and receiving audio data and apparatus therefor |
WO2021047820A1 (en) * | 2019-09-13 | 2021-03-18 | Nokia Technologies Oy | An apparatus, a method and a computer program for video coding and decoding |
CN112735445A (en) * | 2020-12-25 | 2021-04-30 | 广州朗国电子科技有限公司 | Method, apparatus and storage medium for adaptively selecting audio track |
- 2021-10-15: application CN202111204386.4A filed (CN); published as CN114051194A; status: Pending
Patent Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101777370A (en) * | 2004-07-02 | 2010-07-14 | 苹果公司 | Universal container for audio data |
CN101111894A (en) * | 2005-01-25 | 2008-01-23 | 尼禄股份公司 | Method for preparing dvd-video formatted data, method for reconstructing dvd-video data and dvd-video data structure |
CN101802823A (en) * | 2007-08-20 | 2010-08-11 | 诺基亚公司 | Segmented metadata and indexes for streamed multimedia data |
CN102246491A (en) * | 2008-10-08 | 2011-11-16 | 诺基亚公司 | System and method for storing multi-source multimedia presentations |
US20110282650A1 (en) * | 2010-05-17 | 2011-11-17 | Avaya Inc. | Automatic normalization of spoken syllable duration |
WO2013006342A1 (en) * | 2011-07-01 | 2013-01-10 | Dolby Laboratories Licensing Corporation | Synchronization and switchover methods and systems for an adaptive audio system |
CN103621101A (en) * | 2011-07-01 | 2014-03-05 | 杜比实验室特许公司 | Synchronization and switchover methods and systems for an adaptive audio system |
WO2013182901A1 (en) * | 2012-06-07 | 2013-12-12 | Actiwave Ab | Non-linear control of loudspeakers |
US20140123006A1 (en) * | 2012-10-25 | 2014-05-01 | Apple Inc. | User interface for streaming media stations with flexible station creation |
CN110600043A (en) * | 2013-06-19 | 2019-12-20 | 杜比实验室特许公司 | Audio processing unit, method executed by audio processing unit, and storage medium |
CN109273014A (en) * | 2015-03-13 | 2019-01-25 | 杜比国际公司 | Decode the audio bit stream with the frequency spectrum tape copy metadata of enhancing |
WO2018040102A1 (en) * | 2016-09-05 | 2018-03-08 | 华为技术有限公司 | Audio processing method and device |
CN111542806A (en) * | 2017-10-12 | 2020-08-14 | 弗劳恩霍夫应用研究促进协会 | Method and apparatus for efficient delivery and use of high quality of experience audio messages |
US20210050028A1 (en) * | 2018-01-26 | 2021-02-18 | Lg Electronics Inc. | Method for transmitting and receiving audio data and apparatus therefor |
US20190265943A1 (en) * | 2018-02-23 | 2019-08-29 | Bose Corporation | Content based dynamic audio settings |
WO2021047820A1 (en) * | 2019-09-13 | 2021-03-18 | Nokia Technologies Oy | An apparatus, a method and a computer program for video coding and decoding |
CN112735445A (en) * | 2020-12-25 | 2021-04-30 | 广州朗国电子科技有限公司 | Method, apparatus and storage medium for adaptively selecting audio track |
Non-Patent Citations (2)
Title |
---|
International Telecommunication Union: "Audio Definition Model", Recommendation ITU-R BS.2076-1 *
Duan Zhuojun: "Research and Implementation of Streaming Media Transmission in Broadcast Systems", China Master's Theses Full-text Database, Information Science and Technology Series *
Similar Documents
Publication | Title | Publication Date |
---|---|---|
JP6174326B2 (en) | Acoustic signal generating device and acoustic signal reproducing device | |
CN114023339A (en) | Audio-bed-based audio packet format metadata and generation method, device and medium | |
CN111164679B (en) | Encoding device and method, decoding device and method, and program | |
CN113905321A (en) | Object-based audio channel metadata and generation method, device and storage medium | |
CN104506920A (en) | Method and device for playing omnimedia data information | |
CN114203189A (en) | Method, apparatus and medium for generating metadata based on binaural audio packet format | |
CN114023340A (en) | Object-based audio packet format metadata and generation method, apparatus, and medium | |
CN114979935A (en) | Object output rendering item determination method, device, equipment and storage medium | |
CN114203190A (en) | Matrix-based audio packet format metadata and generation method, device and storage medium | |
CN114051194A (en) | Audio track metadata and generation method, electronic equipment and storage medium | |
CN114143695A (en) | Audio stream metadata and generation method, electronic equipment and storage medium | |
CN114512152A (en) | Method, device and equipment for generating broadcast audio format file and storage medium | |
US20090088879A1 (en) | Audio reproduction device and method for audio reproduction | |
CN114121036A (en) | Audio track unique identification metadata and generation method, electronic device and storage medium | |
CN113905322A (en) | Method, device and storage medium for generating metadata based on binaural audio channel | |
CN115190412A (en) | Method, device and equipment for generating internal data structure of renderer and storage medium | |
CN114360556A (en) | Serial audio metadata frame generation method, device, equipment and storage medium | |
CN113923264A (en) | Scene-based audio channel metadata and generation method, device and storage medium | |
CN113923584A (en) | Matrix-based audio channel metadata and generation method, equipment and storage medium | |
CN113889128A (en) | Audio production model and generation method, electronic equipment and storage medium | |
CN114530157A (en) | Audio metadata channel allocation block generation method, apparatus, device and medium | |
CN113938811A (en) | Audio channel metadata based on sound bed, generation method, equipment and storage medium | |
CN115426611A (en) | Method and apparatus for rendering object-based audio using metadata | |
CN114363790A (en) | Method, apparatus, device and medium for generating metadata of serial audio block format | |
CN115209310A (en) | Method and device for rendering sound bed-based audio by using metadata |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20220215 |