WO2022021898A1 - Audio processing method, device, system, and storage medium - Google Patents

Audio processing method, device, system, and storage medium

Info

Publication number
WO2022021898A1
WO2022021898A1 (PCT/CN2021/081459)
Authority
WO
WIPO (PCT)
Prior art keywords
audio signal
metadata
rendering
sensor
playback device
Prior art date
Application number
PCT/CN2021/081459
Other languages
English (en)
French (fr)
Inventor
潘兴德
谭敏强
Original Assignee
北京全景声信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京全景声信息科技有限公司
Priority to EP21850364.7A (published as EP4171066A4)
Publication of WO2022021898A1
Priority to US18/156,579 (published as US20230156403A1)

Classifications

    • H: ELECTRICITY
        • H04: ELECTRIC COMMUNICATION TECHNIQUE
            • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
                • H04R 1/00: Details of transducers, loudspeakers or microphones
                    • H04R 1/10: Earpieces; attachments therefor; earphones; monophonic headphones
                        • H04R 1/1041: Mechanical or electronic switches, or control elements
                • H04R 5/00: Stereophonic arrangements
                    • H04R 5/033: Headphones for stereophonic communication
                    • H04R 5/04: Circuit arrangements, e.g. for selective connection of amplifier inputs/outputs to loudspeakers, for loudspeaker detection, or for adaptation of settings to personal preferences or hearing impairments
                • H04R 2420/00: Details of connection covered by H04R, not provided for in its groups
                    • H04R 2420/07: Applications of wireless loudspeakers or wireless microphones
            • H04S: STEREOPHONIC SYSTEMS
                • H04S 1/00: Two-channel systems
                • H04S 3/00: Systems employing more than two channels, e.g. quadraphonic
                    • H04S 3/008: Systems in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
                • H04S 7/00: Indicating arrangements; control arrangements, e.g. balance control
                    • H04S 7/30: Control circuits for electronic adaptation of the sound field
                        • H04S 7/302: Electronic adaptation of stereophonic sound system to listener position or orientation
                            • H04S 7/303: Tracking of listener position or orientation
                                • H04S 7/304: For headphones
                • H04S 2400/00: Details of stereophonic systems covered by H04S but not provided for in its groups
                    • H04S 2400/01: Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved
                • H04S 2420/00: Techniques used in stereophonic systems covered by H04S but not provided for in its groups
                    • H04S 2420/01: Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]

Definitions

  • the present application relates to the field of electronic technology, and in particular, to an audio processing method, device, system, and storage medium.
  • Earphones have become a staple of everyday listening. Owing to their convenience, wireless earphones are increasingly popular and are gradually becoming the mainstream earphone product. Accordingly, expectations for sound quality keep rising: listeners pursue not only lossless audio but also spatial, immersive sound, and more and more people now seek 360° surround sound and truly immersive three-dimensional panoramic sound.
  • Existing wireless earphones, such as traditional Bluetooth earphones and TWS (true wireless stereo) earphones, can only present a two-channel stereo sound field and are increasingly unable to meet users' actual needs, particularly the need for sound localization when watching movies or playing games.
  • The present application provides an audio processing method, device, system, and storage medium to solve the technical problem of presenting high-quality surround sound and panoramic sound through wireless headphones.
  • the present application provides an audio processing method, applied to a wireless headset, including:
  • An audio signal to be presented, sent by the playback device, is received via wireless transmission. The audio signal to be presented includes a first audio signal and/or a second audio signal, where the first audio signal is an audio signal already rendered by the playback device and the second audio signal is an audio signal still to be rendered.
  • When the audio signal to be presented includes the second audio signal, rendering processing is performed on the second audio signal to obtain a third audio signal.
  • Audio playback is then performed according to the first audio signal and/or the third audio signal.
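The three steps above (receive, conditionally render, play) can be sketched as follows. This is only an illustrative reading of the claim, not the application's implementation; the names `AudioToPresent`, `render`, and `handle` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class AudioToPresent:
    """Hypothetical container for the signal sent by the playback device."""
    first: Optional[list] = None   # already rendered by the playback device
    second: Optional[list] = None  # still to be rendered by the headset

def render(signal: list) -> list:
    """Stand-in for the headset-side rendering step (e.g. HRTF filtering)."""
    return [s * 0.5 for s in signal]  # placeholder gain, not a real renderer

def handle(packet: AudioToPresent) -> list:
    """Receive -> conditionally render -> mix for playback."""
    out: List[float] = []
    if packet.first is not None:       # pre-rendered path: play as-is
        out = list(packet.first)
    if packet.second is not None:      # to-be-rendered path: render it first
        third = render(packet.second)  # the "third audio signal"
        out = [a + b for a, b in zip(out, third)] if out else third
    return out
```

When both signals are present (the partial-rendering case), the pre-rendered first signal and the locally rendered third signal are mixed before playback.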
  • Optionally, before receiving the audio signal to be presented sent by the playback device via wireless transmission, the method includes: sending an indication signal to the playback device via wireless transmission, where the indication signal instructs the playback device to render the original audio signal according to a corresponding preset processing method to obtain the audio signal to be presented.
  • Optionally, before sending the indication signal to the playback device, the method further includes: acquiring performance parameters of the wireless headset and determining the indication signal according to the performance parameters.
  • Optionally, before sending the indication signal to the playback device, the method further includes: receiving audio characteristic information sent by the playback device, where the audio characteristic information includes characteristic parameters of the original audio signal input to the playback device, the characteristic parameters including at least one of a bitstream format, channel parameters, object parameters, and scene component parameters.
  • The indication signal includes an identification code:
  • when the identification code indicates that the playback device does not render the original audio signal, the audio signal to be presented includes the second audio signal but not the first audio signal, and the wireless headset performs all rendering of the original audio signal;
  • when the identification code indicates that the playback device performs all rendering of the original audio signal, the audio signal to be presented includes the first audio signal but not the second audio signal, and the wireless headset does not render the original audio signal;
  • when the identification code indicates that the playback device partially renders the original audio signal, the audio signal to be presented includes both the first audio signal and the second audio signal, and the wireless headset renders the remainder of the original audio signal.
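The three identification-code cases can be modelled as a small dispatch table. The concrete enum values below are invented for illustration; the application does not specify actual code values.

```python
from enum import Enum

class RenderMode(Enum):
    """Hypothetical identification codes for the indication signal."""
    DEVICE_NONE = 0     # headset renders everything
    DEVICE_ALL = 1      # playback device renders everything
    DEVICE_PARTIAL = 2  # rendering is split between the two

def expected_payload(mode: RenderMode) -> dict:
    """Which signals the audio-to-be-presented carries in each mode."""
    return {
        RenderMode.DEVICE_NONE:    {"first": False, "second": True},
        RenderMode.DEVICE_ALL:     {"first": True,  "second": False},
        RenderMode.DEVICE_PARTIAL: {"first": True,  "second": True},
    }[mode]
```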
  • Optionally, the method further includes: decoding the audio signal to be presented to obtain the first audio signal and/or the second audio signal.
  • Performing rendering processing on the second audio signal to obtain the third audio signal includes:
  • performing rendering processing on the second audio signal according to rendering metadata, where the rendering metadata includes first metadata and second metadata: the first metadata is metadata on the playback device side, and the second metadata is metadata on the wireless headset side.
  • The first metadata includes playback device sensor metadata, where the playback device sensor metadata characterizes the motion of the playback device; and/or
  • the second metadata includes headset sensor metadata and a head-related transfer function (HRTF) database, where the headset sensor metadata characterizes the motion of the wireless headset.
  • The headset sensor metadata is obtained through a headset sensor, which includes at least one of a gyroscope sensor, a head size sensor, a ranging sensor, a geomagnetic sensor, and an acceleration sensor; and/or
  • the playback device sensor metadata is obtained through a playback device sensor, which includes at least one of a gyroscope sensor, a head size sensor, a ranging sensor, a geomagnetic sensor, and an acceleration sensor.
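One way to picture the two-sided rendering metadata is as a pair of motion records plus an HRTF lookup table. The field names and the shape of the HRTF mapping below are assumptions for illustration, not the application's data format.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass
class SensorReading:
    """Hypothetical motion sample from a gyroscope/geomagnetic/acceleration sensor."""
    yaw: float = 0.0
    pitch: float = 0.0
    roll: float = 0.0

@dataclass
class RenderingMetadata:
    """First metadata: playback-device side; second metadata: headset side."""
    playback_device_motion: SensorReading = field(default_factory=SensorReading)
    headset_motion: SensorReading = field(default_factory=SensorReading)
    # HRTF database sketch: (azimuth, elevation) -> (left filter taps, right filter taps)
    hrtf: Dict[Tuple[int, int], Tuple[list, list]] = field(default_factory=dict)
```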
  • The wireless earphone includes a first wireless earphone and a second wireless earphone.
  • Either the first wireless earphone or the second wireless earphone is provided with the earphone sensor; or
  • both the first and second wireless earphones are provided with earphone sensors, in which case, after each earphone obtains its earphone sensor metadata, the two earphones synchronize the earphone sensor metadata with each other.
  • The first and second wireless earphones each establish a wireless connection with the playback device. Receiving the audio signal to be presented then includes:
  • the first wireless earphone receives a first audio signal to be presented sent by the playback device, and the second wireless earphone receives a second audio signal to be presented sent by the playback device.
  • Rendering in the wireless headset then includes:
  • the first wireless earphone performs rendering processing on the first audio signal to be presented to obtain a first playback audio signal;
  • the second wireless earphone performs rendering processing on the second audio signal to be presented to obtain a second playback audio signal;
  • the first wireless earphone plays the first playback audio signal; and
  • the second wireless earphone plays the second playback audio signal.
  • Optionally, before the first wireless earphone performs rendering processing on the first audio signal to be presented, the method further includes:
  • the first wireless earphone decodes the first audio signal to be presented to obtain a first decoded audio signal;
  • the first wireless earphone then performs rendering according to the first decoded audio signal and rendering metadata to obtain the first playback audio signal.
  • Correspondingly, the method further includes:
  • the second wireless earphone decodes the second audio signal to be presented to obtain a second decoded audio signal;
  • the second wireless earphone then performs rendering according to the second decoded audio signal and the rendering metadata to obtain the second playback audio signal.
  • The rendering metadata includes at least one of first wireless headset metadata, second wireless headset metadata, and playback device metadata.
  • The first wireless headset metadata includes first headset sensor metadata and a head-related transfer function (HRTF) database, where the first headset sensor metadata characterizes the motion of the first wireless headset.
  • The second wireless headset metadata includes second headset sensor metadata and an HRTF database, where the second headset sensor metadata characterizes the motion of the second wireless headset.
  • The playback device metadata includes playback device sensor metadata, where the playback device sensor metadata characterizes the motion of the playback device.
  • Optionally, before performing the rendering process, the method further includes:
  • the first wireless headset synchronizing the rendering metadata with the second wireless headset.
  • When only one of the two headsets is provided with a headset sensor (for example, the first wireless headset has a sensor and the second does not, or vice versa) and the playback device is not provided with a playback device sensor, synchronizing the rendering metadata includes:
  • the wireless headset that has the sensor sends its headset sensor metadata to the other wireless headset, and the other wireless headset uses the received metadata as its own headset sensor metadata.
  • When both the first and second wireless earphones are provided with headset sensors and the playback device is not provided with a playback device sensor, synchronizing the rendering metadata includes:
  • the first wireless headset sends the first headset sensor metadata to the second wireless headset, and the second wireless headset sends the second headset sensor metadata to the first wireless headset;
  • the first and second wireless headsets each determine the rendering metadata from the first headset sensor metadata, the second headset sensor metadata, and a preset numerical algorithm; or
  • both headsets send their headset sensor metadata to the playback device, which determines the rendering metadata from the first headset sensor metadata, the second headset sensor metadata, and a preset numerical algorithm; and
  • the first and second wireless headsets each receive the rendering metadata from the playback device.
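The application leaves the "preset numerical algorithm" unspecified. A minimal assumption is per-axis averaging of the two earphones' orientation readings, so that both sides render against the same shared head pose; the function below is a sketch under that assumption only.

```python
def fuse_orientations(first: dict, second: dict) -> dict:
    """Hypothetical preset numerical algorithm: average the per-axis angles
    reported by the two earphone sensors into one shared head orientation."""
    return {axis: (first[axis] + second[axis]) / 2.0
            for axis in ("yaw", "pitch", "roll")}

# Example: each earbud reports a slightly different reading of the same head.
left = {"yaw": 10.0, "pitch": 0.0, "roll": 2.0}
right = {"yaw": 14.0, "pitch": 2.0, "roll": 0.0}
shared = fuse_orientations(left, right)
```

A real system would more likely run a proper sensor-fusion filter (e.g. complementary or Kalman filtering), but any deterministic rule both sides agree on satisfies the claim's requirement that each headset derives the same rendering metadata.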
  • When only one of the two headsets (for example, the first wireless headset) is provided with a headset sensor and the playback device is provided with a playback device sensor, synchronizing the rendering metadata includes:
  • the first wireless headset sends the first headset sensor metadata to the playback device, which determines the rendering metadata from the first headset sensor metadata, the playback device sensor metadata, and a preset numerical algorithm; and
  • the first and second wireless headsets each receive the rendering metadata; or
  • the first wireless headset receives the playback device sensor metadata sent by the playback device;
  • the first wireless headset determines the rendering metadata from the first headset sensor metadata, the playback device sensor metadata, and the preset numerical algorithm; and
  • the first wireless headset sends the rendering metadata to the second wireless headset.
  • When both wireless earphones are provided with headset sensors and the playback device is provided with a playback device sensor, synchronizing the rendering metadata includes:
  • both headsets send their headset sensor metadata to the playback device, which determines the rendering metadata from the first headset sensor metadata, the second headset sensor metadata, the playback device sensor metadata, and a preset numerical algorithm; and
  • the first and second wireless headsets each receive the rendering metadata; or
  • the first wireless headset sends the first headset sensor metadata to the second wireless headset, and the second wireless headset sends the second headset sensor metadata to the first wireless headset;
  • both headsets receive the playback device sensor metadata; and
  • each headset determines the rendering metadata from the first headset sensor metadata, the second headset sensor metadata, the playback device sensor metadata, and the preset numerical algorithm.
  • the audio signal to be presented includes at least one of a channel-based audio signal, an object-based audio signal, and a scene-based audio signal.
  • the rendering process includes at least one of binaural virtual rendering, channel signal rendering, object signal rendering, and scene signal rendering.
  • The wireless transmission mode includes Bluetooth communication, infrared communication, Wi-Fi communication, and Li-Fi visible-light communication.
  • The present application also provides another audio processing method, applied to a playback device, including:
  • receiving an original audio signal and generating from it an audio signal to be presented, where the audio signal to be presented includes a first audio signal and/or a second audio signal, the first audio signal being an audio signal already rendered by the playback device and the second audio signal being an audio signal still to be rendered; and
  • sending the audio signal to be presented to the wireless headset via wireless transmission.
  • Optionally, before sending the audio signal to be presented to the wireless headset, the method includes:
  • receiving, via wireless transmission, an indication signal sent by the wireless headset, where the indication signal instructs the playback device to render the original audio signal according to a corresponding preset processing method to obtain the audio signal to be presented.
  • Optionally, before sending the audio signal to be presented to the wireless headset, the method further includes:
  • receiving, via wireless transmission, performance parameters of the wireless headset and determining an indication signal from those parameters, where the indication signal instructs the playback device to render the original audio signal according to a corresponding preset processing method to obtain the audio signal to be presented.
  • Receiving the performance parameters of the wireless headset and determining the indication signal from them includes:
  • acquiring characteristic parameters of the original audio signal, the characteristic parameters including at least one of a bitstream format, channel parameters, object parameters, and scene component parameters; and
  • determining the indication signal from the characteristic parameters and the performance parameters.
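A negotiation along these lines could be sketched as follows. The cost model, thresholds, and field names (`headset_mips`, `channels`, `objects`, `scene_components`) are all invented for illustration; the application only requires that the decision depend on both the stream's characteristic parameters and the headset's performance parameters.

```python
def choose_indication(headset_mips: float, stream: dict) -> str:
    """Hypothetical negotiation: decide how much rendering the playback
    device should keep, given the headset's compute budget and the
    complexity of the input stream. Returns one of the three
    identification-code cases described above."""
    # Crude cost model: object and scene signals are assumed costlier
    # to render than plain channel signals.
    cost = (stream.get("channels", 0)
            + 3 * stream.get("objects", 0)
            + 2 * stream.get("scene_components", 0))
    if headset_mips >= cost:
        return "DEVICE_NONE"      # headset can render everything itself
    if headset_mips == 0:
        return "DEVICE_ALL"       # headset cannot render at all
    return "DEVICE_PARTIAL"       # split rendering between the two
```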
  • The indication signal includes an identification code:
  • when the identification code indicates that the playback device does not render the original audio signal, the audio signal to be presented includes the second audio signal but not the first audio signal, and the wireless headset performs all rendering of the original audio signal;
  • when the identification code indicates that the playback device performs all rendering of the original audio signal, the audio signal to be presented includes the first audio signal but not the second audio signal, and the wireless headset does not render the original audio signal;
  • when the identification code indicates that the playback device partially renders the original audio signal, the audio signal to be presented includes both the first audio signal and the second audio signal, and the wireless headset renders the remainder of the original audio signal.
  • The original audio signal includes a fourth audio signal and/or a fifth audio signal, where the fourth audio signal is used, after processing, to generate the first audio signal and the fifth audio signal is used to generate the second audio signal.
  • the method further includes:
  • the eighth audio signal and the ninth audio signal are encoded to obtain a tenth audio signal, and the to-be-presented audio signal includes the fifth audio signal and the tenth audio signal.
  • The rendering processing on the seventh audio signal includes:
  • performing rendering processing on the seventh audio signal according to rendering metadata to obtain the ninth audio signal, where the rendering metadata includes first metadata and second metadata: the first metadata is metadata on the playback device side, and the second metadata is metadata on the wireless headset side.
  • The first metadata includes playback device sensor metadata, where the playback device sensor metadata characterizes the motion of the playback device; and/or
  • the second metadata includes headset sensor metadata and a head-related transfer function (HRTF) database, where the headset sensor metadata characterizes the motion of the wireless headset.
  • The headset sensor metadata is obtained through a headset sensor, which includes at least one of a gyroscope sensor, a head size sensor, a ranging sensor, a geomagnetic sensor, and an acceleration sensor; and/or
  • the playback device sensor metadata is obtained through a playback device sensor, which includes at least one of a gyroscope sensor, a head size sensor, a ranging sensor, a geomagnetic sensor, and an acceleration sensor.
  • the audio signal to be presented includes at least one of a channel-based audio signal, an object-based audio signal, and a scene-based audio signal.
  • the rendering processing includes: at least one of binaural virtual rendering, channel signal rendering, object signal rendering, and scene signal rendering.
  • The wireless transmission mode includes Bluetooth communication, infrared communication, Wi-Fi communication, and Li-Fi visible-light communication.
  • an audio processing device comprising:
  • an acquisition module configured to receive, via wireless transmission, the audio signal to be presented sent by the playback device, where the audio signal to be presented includes a first audio signal and/or a second audio signal, the first audio signal being an audio signal already rendered by the playback device and the second audio signal being an audio signal still to be rendered;
  • a rendering module configured to, when the audio signal to be presented includes the second audio signal, perform rendering processing on the second audio signal to obtain a third audio signal; and
  • a playback module configured to perform subsequent audio playback according to the first audio signal and/or the third audio signal.
  • Optionally, before the acquisition module receives the audio signal to be presented sent by the playback device, the device further includes:
  • a sending module configured to send an indication signal to the playback device via wireless transmission, where the indication signal instructs the playback device to render the original audio signal according to a corresponding preset processing method to obtain the audio signal to be presented.
  • Optionally, before the sending module sends the indication signal to the playback device, the acquisition module is further configured to acquire performance parameters of the wireless headset and determine the indication signal according to the performance parameters.
  • Optionally, before the sending module sends the indication signal to the playback device, the acquisition module is further configured to receive audio characteristic information sent by the playback device, where the audio characteristic information includes characteristic parameters of the original audio signal input to the playback device, the characteristic parameters including at least one of a bitstream format, channel parameters, object parameters, and scene component parameters.
  • The indication signal includes an identification code:
  • when the identification code indicates that the playback device does not render the original audio signal, the audio signal to be presented includes the second audio signal but not the first audio signal, and the audio processing device performs all rendering of the original audio signal;
  • when the identification code indicates that the playback device performs all rendering of the original audio signal, the audio signal to be presented includes the first audio signal but not the second audio signal, and the audio processing device does not render the original audio signal;
  • when the identification code indicates that the playback device partially renders the original audio signal, the audio signal to be presented includes both the first audio signal and the second audio signal, and the audio processing device renders the remainder of the original audio signal.
  • Optionally, after the acquisition module receives the audio signal to be presented sent by the playback device, the device further includes:
  • a decoding module configured to decode the audio signal to be presented to obtain the first audio signal and/or the second audio signal.
  • The rendering module, when performing rendering processing on the second audio signal to obtain the third audio signal, is configured to:
  • perform rendering processing on the second audio signal according to rendering metadata, where the rendering metadata includes first metadata and second metadata, the first metadata being metadata on the playback device side and the second metadata being metadata on the wireless headset side.
  • The first metadata includes first sensing module metadata, where the first sensing module metadata characterizes the motion of the playback device; and/or
  • the second metadata includes second sensing module metadata and a head-related transfer function (HRTF) database, where the second sensing module metadata characterizes the motion of the wireless headset.
  • The playback device sensor metadata is obtained through the first sensing module, which includes at least one of a gyroscope sensing sub-module, a head size sensing sub-module, a ranging sensing sub-module, a geomagnetic sensing sub-module, and an acceleration sensing sub-module; and/or
  • the headset sensor metadata is obtained through the second sensing module, which includes at least one of a gyroscope sensing sub-module, a head size sensing sub-module, a ranging sensing sub-module, a geomagnetic sensing sub-module, and an acceleration sensing sub-module.
  • The audio processing device includes a first audio processing device and a second audio processing device.
  • Either the first audio processing device or the second audio processing device is provided with the second sensing module; or
  • both the first and second audio processing devices are provided with the second sensing module, in which case, after the acquisition modules of the two audio processing devices obtain their sensing module metadata, the device further includes:
  • a synchronization module used to synchronize that sensor metadata between the two audio processing devices.
  • The first audio processing device includes:
  • a first receiving module configured to receive the first audio signal to be presented sent by the playback device;
  • a first rendering module configured to perform rendering processing on the first audio signal to be presented to obtain a first playback audio signal; and
  • a first playback module configured to play the first playback audio signal.
  • The second audio processing device includes:
  • a second receiving module configured to receive the second audio signal to be presented sent by the playback device;
  • a second rendering module configured to perform rendering processing on the second audio signal to be presented to obtain a second playback audio signal; and
  • a second playback module configured to play the second playback audio signal.
  • the first audio processing device further includes:
  • a first decoding module configured to perform decoding processing on the first to-be-presented audio signal to obtain a first decoded audio signal
  • the first rendering module is specifically configured to: perform rendering processing according to the first decoded audio signal and rendering metadata to obtain the first playback audio signal;
  • the second audio processing device further includes:
  • a second decoding module configured to perform decoding processing on the second to-be-presented audio signal to obtain a second decoded audio signal
  • the second rendering module is specifically configured to: perform rendering processing according to the second decoded audio signal and rendering metadata to obtain the second playback audio signal.
  • the rendering metadata includes at least one of first wireless headset metadata, second wireless headset metadata, and playback device metadata.
  • The first wireless headset metadata includes first headset sensor metadata and a head-related transfer function (HRTF) database, where the first headset sensor metadata characterizes the motion of the first wireless headset.
  • The second wireless headset metadata includes second headset sensor metadata and an HRTF database, where the second headset sensor metadata characterizes the motion of the second wireless headset.
  • the playback device metadata includes playback device sensor metadata, wherein the playback device sensor metadata is used to characterize motion characteristics of the playback device.
  • the first audio processing device further includes:
  • a first synchronization module for synchronizing the rendering metadata with the second wireless headset
  • the second audio processing device further includes:
  • a second synchronization module configured to synchronize the rendering metadata with the first wireless headset.
  • The first synchronization module is specifically configured to send the first headset sensor metadata to the second wireless headset, and the second wireless headset uses the received first headset sensor metadata as the second headset sensor metadata.
  • the first synchronization module is specifically used for:
  • the second synchronization module is specifically used for:
  • the rendering metadata is determined according to the first headphone sensor metadata, the second headphone sensor metadata, and a preset numerical algorithm; or,
  • the first synchronization module is specifically used for:
  • the second synchronization module is specifically used for:
  • the rendering metadata is received.
  • the first synchronization module is specifically used for:
  • the first synchronization module is specifically used for:
  • the second synchronization module is specifically used for:
  • the rendering metadata is determined according to the first headphone sensor metadata, the second headphone sensor metadata, the playback device sensor metadata, and a preset numerical algorithm.
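One possible form of the "preset numerical algorithm" that fuses the first headset sensor metadata, the second headset sensor metadata, and the playback device sensor metadata is sketched below. The yaw-only representation and the circular averaging are illustrative assumptions for this sketch, not something mandated by this application:

```python
import math

def combine_sensor_metadata(first_yaw, second_yaw, device_yaw):
    # Average the two earbud yaw readings as unit vectors (robust across the
    # 0/360 degree wrap-around).
    x = math.cos(math.radians(first_yaw)) + math.cos(math.radians(second_yaw))
    y = math.sin(math.radians(first_yaw)) + math.sin(math.radians(second_yaw))
    head_yaw = math.degrees(math.atan2(y, x))
    # Subtract the playback device yaw so that moving the device together with
    # the head does not rotate the rendered sound field.
    return (head_yaw - device_yaw) % 360.0
```

A full implementation would fuse complete orientation quaternions from the gyroscope, geomagnetic, and acceleration sensors rather than a single yaw angle.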
  • the audio signal to be presented includes at least one of a channel-based audio signal, an object-based audio signal, and a scene-based audio signal.
  • the rendering processing includes: at least one of binaural virtual rendering, channel signal rendering, object signal rendering, and scene signal rendering.
  • the wireless transmission mode includes: Bluetooth communication, infrared communication, WIFI communication, and LIFI visible light communication.
  • another audio processing device provided by this application includes:
  • an acquisition module configured to receive an original audio signal and generate an audio signal to be presented according to the original audio signal, where the audio signal to be presented includes a first audio signal and/or a second audio signal, wherein the first audio signal is an audio signal that has been rendered by the playback device, and the second audio signal is an audio signal to be rendered;
  • the sending module is used for sending the audio signal to be presented to the wireless headset through wireless transmission.
  • before the sending module sends the audio signal to be presented to the wireless headset through wireless transmission:
  • the acquisition module is further configured to receive an indication signal sent by the wireless headset through the wireless transmission method, where the indication signal is used to instruct the playback device to render the original audio signal according to a corresponding preset processing method , to obtain the audio signal to be presented.
  • before the sending module sends the audio signal to be presented to the wireless headset through wireless transmission, the device further includes:
  • the acquisition module is further configured to receive the performance parameter of the wireless headset through the wireless transmission, and determine an indication signal according to the performance parameter, where the indication signal is used to instruct the playback device to respond to the original audio signal Rendering is performed according to a corresponding preset processing manner to obtain the audio signal to be presented.
  • the acquisition module is further configured to receive the performance parameters of the wireless headset through the wireless transmission, and determine an indication signal according to the performance parameters, including:
  • the obtaining module is further configured to obtain characteristic parameters of the original audio signal, where the characteristic parameters include: at least one of a code stream format, a channel parameter, an object parameter and a scene component parameter;
  • the obtaining module is further configured to determine the indication signal according to the characteristic parameter and the performance parameter.
  • the indication signal includes an identification code
  • the playback device does not render the original audio signal, the audio signal to be presented includes the second audio signal and does not include the first audio signal, and the audio processing device performs all rendering of the original audio signal;
  • the playback device performs all rendering of the original audio signal, the audio signal to be presented includes the first audio signal but does not include the second audio signal, and the audio processing device does not render the original audio signal;
  • the playback device performs partial rendering of the original audio signal, the audio signal to be presented includes the first audio signal and the second audio signal, and the audio processing device renders the remaining part of the original audio signal.
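The three cases above can be expressed as a small dispatch on the identification code. The concrete code values below are hypothetical placeholders, since this application does not fix numeric values for the identification code:

```python
# Hypothetical identification-code values; this application does not fix
# concrete numbers for the identification code.
CODE_HEADSET_RENDERS_ALL = 0   # playback device does no rendering
CODE_DEVICE_RENDERS_ALL = 1    # playback device renders everything
CODE_SPLIT_RENDERING = 2       # partial rendering on each side

def rendering_split(identification_code):
    """Return (device_renders, headset_renders) for an identification code."""
    table = {
        CODE_HEADSET_RENDERS_ALL: (False, True),
        CODE_DEVICE_RENDERS_ALL: (True, False),
        CODE_SPLIT_RENDERING: (True, True),
    }
    if identification_code not in table:
        raise ValueError("unknown identification code")
    return table[identification_code]
```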
  • the original audio signal includes a fourth audio signal and/or a fifth audio signal, wherein the fourth audio signal is used, after processing, to generate the first audio signal, and the fifth audio signal is used to generate the second audio signal;
  • after the acquisition module acquires the original audio signal, the device further includes:
  • a decoding module configured to decode the fourth audio signal to obtain a sixth audio signal, where the sixth audio signal includes a seventh audio signal and/or an eighth audio signal;
  • a rendering module configured to perform rendering processing on the seventh audio signal to obtain a ninth audio signal
  • an encoding module configured to encode the eighth audio signal and the ninth audio signal to obtain a tenth audio signal, and the to-be-presented audio signal includes the fifth audio signal and the tenth audio signal.
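The data flow through the decoding, rendering, and encoding modules above can be sketched as a single function. The `decode`/`render`/`encode` callables are placeholders for the real codecs and renderer, not APIs defined by this application:

```python
def prepare_to_present(fourth, fifth, decode, render, encode):
    # The "fourth" signal is decoded into a part to render on the device
    # ("seventh") and a pass-through part ("eighth"); the rendered "ninth"
    # and the "eighth" are re-encoded into the "tenth"; the audio signal to
    # be presented is the "fifth" plus the "tenth".
    seventh, eighth = decode(fourth)
    ninth = render(seventh)
    tenth = encode(eighth, ninth)
    return (fifth, tenth)
```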
  • the rendering module configured to perform rendering processing on the seventh audio signal, includes:
  • the rendering module is configured to perform rendering processing on the seventh audio signal according to rendering metadata to obtain the ninth audio signal, wherein the rendering metadata includes first metadata and second metadata, the first metadata is metadata on the playback device, and the second metadata is metadata on the wireless headset.
  • the first metadata includes first sensing sub-module metadata, wherein the first sensing sub-module metadata is used to characterize the motion characteristics of the playback device; and/or ,
  • the second metadata includes second sensing sub-module metadata and a head-related transfer function (HRTF) database, wherein the second sensing sub-module metadata is used to characterize the motion characteristics of the wireless headset.
  • the metadata of the first sensing sub-module is obtained through the first sensing sub-module, and the first sensing sub-module includes at least one of a gyroscope sensing sub-module, a head size sensing sub-module, a ranging sensing sub-module, a geomagnetic sensing sub-module and an acceleration sensing sub-module; and/or,
  • the metadata of the second sensing sub-module is obtained through the second sensing sub-module, and the second sensing sub-module includes at least one of a gyroscope sensing sub-module, a head size sensing sub-module, a ranging sensing sub-module, a geomagnetic sensing sub-module and an acceleration sensing sub-module.
  • the audio signal to be presented includes at least one of a channel-based audio signal, an object-based audio signal, and a scene-based audio signal.
  • the rendering processing includes: at least one of binaural virtual rendering, channel signal rendering, object signal rendering, and scene signal rendering.
  • the wireless transmission method includes: Bluetooth communication, infrared communication, WIFI communication, and LIFI visible light communication.
  • the present application also provides a wireless headset, including:
  • a memory for storing a computer program for the processor
  • the processor is configured to implement any one of the possible audio processing methods in the first aspect above by executing the computer program.
  • the present application also provides a playback device, comprising:
  • a memory for storing a computer program for the processor
  • the processor is configured to implement any one of the possible audio processing methods in the second aspect above by executing the computer program.
  • the present application further provides a readable storage medium, in which a computer program is stored, and the computer program is used to execute any one of the possible audio processing methods provided in the first aspect.
  • the present application further provides a readable storage medium, in which a computer program is stored, and the computer program is used to execute any one of the possible audio processing methods provided in the second aspect.
  • the present application further provides a system, including the wireless headset of the fifth aspect and the playback device of the sixth aspect.
  • a wireless earphone terminal receives an audio signal to be presented sent by a playback device through wireless transmission, and the audio signal to be presented includes an audio signal already rendered by the playback device, that is, the first audio signal, and/or an audio signal to be rendered, that is, the second audio signal; then, if the audio signal to be presented includes the second audio signal, the wireless earphone end performs rendering processing on the second audio signal to obtain the third audio signal; finally, the wireless earphone end performs subsequent audio playback according to the first audio signal and/or the third audio signal.
  • FIG. 1 is a schematic structural diagram of a wireless headset according to an exemplary embodiment of the present application
  • FIG. 2 is a schematic diagram of an application scenario of an audio processing method according to an exemplary embodiment of the present application
  • FIG. 3 is a schematic flowchart of an audio processing method according to an exemplary embodiment of the present application.
  • FIG. 4 is a schematic diagram of a rendering method included in an audio data rendering module provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of an HRTF rendering method provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of another HRTF rendering method provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a data flow of audio signal rendering performed by a wireless headset terminal provided by an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of another audio processing method provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a data link of an audio processing signal in a playback device and a wireless headset provided by an embodiment of the present application;
  • FIG. 10 is a schematic flowchart of another audio processing method provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a rendering process of channel information of a TWS true wireless headset provided by an embodiment of the application;
  • FIG. 12 is a schematic structural diagram of an audio processing apparatus provided by an embodiment of the application.
  • FIG. 13 is a schematic structural diagram of another audio processing apparatus provided by an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of a wireless headset provided by the application.
  • FIG. 15 is a schematic structural diagram of another playback device provided by this application.
  • FIG. 1 is a schematic structural diagram of a wireless headset according to an exemplary embodiment of the present application
  • FIG. 2 is a schematic diagram of an application scenario of an audio processing method according to an exemplary embodiment of the present application.
  • the wireless transceiver device group communication method provided in this embodiment is applied to a wireless headset 10, wherein the wireless headset 10 includes a first wireless headset 101 and a second wireless headset 102, and the wireless headset 10 establishes a communication connection with the playback device 20 through the first wireless link 103.
  • the communication connection between the wireless earphone 101 and the wireless earphone 102 in the wireless earphone 10 can be bidirectional or unidirectional, which is not specifically limited in this embodiment.
  • the above-mentioned wireless headset 10 and playback device 20 may be wireless transceiver devices that communicate according to a standard wireless protocol, where the standard wireless protocol may be the Bluetooth protocol, the Wifi protocol, the Lifi protocol, an infrared wireless transmission protocol, etc.; in this embodiment, the specific form of the wireless protocol is not limited.
  • the following takes the Bluetooth protocol as an example of the standard wireless protocol.
  • the wireless earphone 10 may be a TWS (True Wireless Stereo) true wireless earphone, a traditional Bluetooth headset, or the like.
  • FIG. 3 is a schematic flowchart of an audio processing method according to an exemplary embodiment of the present application. As shown in FIG. 3 , the audio processing method provided by this embodiment includes:
  • S301 Acquire an original audio signal, and generate an audio signal to be presented according to the original audio signal.
  • the playback device acquires the original audio signal, and preprocesses the original audio signal, which may include at least one preprocessing program such as decoding, rendering, and re-encoding.
  • the playback device can decode all or part of the original audio signal to obtain audio content data and audio characteristic information
  • the audio content data may include but is not limited to the channel content audio signal.
  • the audio characteristic information may include but is not limited to sound field type, sampling rate, bit rate information, etc.
  • original audio signals include channel-based audio signals, such as AAC/AC3 streams; object-based audio signals, such as ATMOS/MPEG-H streams; scene-based audio signals, such as MPEG-H HOA streams; or any combination of the above three audio signals, such as a WANOS stream.
  • when the original audio signal is a channel-based audio signal, such as an AAC/AC3 code stream, fully decode the audio code stream to obtain the audio content signal of each channel, as well as channel characteristic information such as sound field type, sampling rate, bit rate, etc.
  • when the original audio signal is an object-based audio signal, such as an ATMOS/MPEG-H code stream, only the audio sound bed is decoded to obtain the audio content signal of each channel, as well as channel characteristic information such as sound field type, sampling rate, bit rate, etc.
  • when the original audio signal is a scene-based audio signal, such as an MPEG-H HOA code stream, fully decode the audio code stream to obtain the audio content signal of each channel, as well as channel characteristic information such as sound field type, sampling rate, bit rate, etc.
  • when the original audio signal is any combination of the above three signals, the audio code stream is decoded according to the corresponding code stream decoding descriptions above to obtain the audio content signal of each channel, as well as channel characteristic information such as sound field type, sampling rate, bit rate, etc.
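The per-format decoding choices above can be summarized as a small dispatcher. The format strings are illustrative labels for the stream families named in this description, not normative identifiers:

```python
def plan_decoding(stream_format):
    # Channel-based and scene-based streams are fully decoded; for
    # object-based streams only the audio sound bed is decoded here, leaving
    # the objects to the renderer.
    if stream_format in ("AAC", "AC3"):        # channel-based
        return "full_decode"
    if stream_format in ("ATMOS", "MPEG-H"):   # object-based
        return "decode_bed_only"
    if stream_format == "MPEG-H HOA":          # scene-based
        return "full_decode"
    return "decode_per_component"              # mixed streams, e.g. WANOS
```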
  • the playback device may perform rendering processing on the decoded audio content data to obtain the rendered audio signal and metadata.
  • the audio content may include, but is not limited to, the audio content signal of the channel and the audio content signal of the object;
  • the metadata may include, but is not limited to, channel characteristic information (such as sound field type, sampling rate, bit rate, etc.) and the three-dimensional spatial information of objects; the rendering metadata of the wireless headset may include, but is not limited to, sensor metadata and an HRTF (Head Related Transfer Function) database.
  • FIG. 4 is a schematic diagram of a rendering method included in an audio data rendering module provided by an embodiment of the present application.
  • the rendering mode includes but is not limited to any combination of the following rendering modes: HRTF rendering, channel rendering, object rendering, scene rendering, and the like.
  • FIG. 5 is a schematic flowchart of an HRTF rendering method provided by an embodiment of the present application. As shown in Figure 5, when the decoded audio signal is a channel signal, the specific steps of the rendering method include:
  • the channel audio signal is the content signal of the channels, which includes the signals of a certain number of channels;
  • the basic metadata is the basic information of the channel, including information such as sound field type and sampling rate.
  • the basic metadata is used to construct the spatial distribution of each channel according to a preset algorithm.
  • the sensor metadata in the rendering metadata is received from the sensor, and the spatial distribution of each channel is rotated and transformed accordingly.
  • the specific coordinate conversion method can be calculated according to the conversion method of the general Cartesian coordinate system and the polar coordinate system, and will not be repeated here.
  • the corresponding filter array HRTF(i) is selected from the HRTF database data, and then the audio signals of each channel are filtered.
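The steps above (construct the channel layout, rotate it by the sensor metadata, select the filter array HRTF(i), filter each channel) can be sketched as follows. The channel/database layout, nearest-angle selection, and yaw-only rotation are simplifying assumptions for illustration:

```python
def render_channels_binaural(channel_signals, channel_azimuths, head_yaw, hrtf_db):
    """Minimal binaural rendering of decoded channel signals.

    channel_signals : dict channel name -> list of samples
    channel_azimuths: dict channel name -> loudspeaker azimuth in degrees
    head_yaw        : yaw angle (degrees) from the headset sensor metadata
    hrtf_db         : dict measured azimuth (deg) -> (h_left, h_right) impulse responses
    Returns (left, right) sample lists.
    """
    def convolve(x, h):
        y = [0.0] * (len(x) + len(h) - 1)
        for i, xi in enumerate(x):
            for j, hj in enumerate(h):
                y[i + j] += xi * hj
        return y

    angles = sorted(hrtf_db)
    step = angles[1] - angles[0]                    # assume even spacing
    n = max(len(s) for s in channel_signals.values())
    m = max(len(h[0]) for h in hrtf_db.values())
    left, right = [0.0] * (n + m - 1), [0.0] * (n + m - 1)
    for name, sig in channel_signals.items():
        # Rotate the virtual loudspeaker opposite to the head movement.
        azimuth = (channel_azimuths[name] - head_yaw) % 360
        key = int(round(azimuth / step) * step) % 360  # nearest HRTF(i)
        h_l, h_r = hrtf_db[key]
        for i, v in enumerate(convolve(sig, h_l)):
            left[i] += v
        for i, v in enumerate(convolve(sig, h_r)):
            right[i] += v
    return left, right
```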
  • the sensor metadata may be provided by a combination of gyroscope sensors, geomagnetic devices, and accelerometers;
  • the HRTF database may be personalized based on, but not limited to, other sensor metadata on the wireless headset, such as a head size sensor, or based on front-end equipment with a camera or photographing function: after intelligent recognition of the human head, personalized processing and adjustment are carried out according to the physical characteristics of the listener's head, ears, etc., to achieve a personalized effect;
  • as for the HRTF database, it can be stored in the wireless headset in advance, or a new HRTF database can be imported in a wired or wireless manner to update it and achieve the personalization described above.
  • interpolation can be used during calculation to obtain the HRTF data set of the corresponding angle; in addition, subsequent processing steps can be added after S505, including but not limited to equalization (EQ), delay, reverberation, etc.
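A minimal sketch of the interpolation mentioned above, assuming evenly spaced measured azimuths. Time-domain linear interpolation is one simple option; production systems often interpolate magnitude and phase in the frequency domain instead:

```python
def interpolate_hrtf(hrtf_db, azimuth):
    """Linearly interpolate an HRTF pair for an azimuth missing from the database.

    hrtf_db maps evenly spaced measured azimuths (degrees) to (h_left, h_right)
    impulse responses of equal length.
    """
    angles = sorted(hrtf_db)
    step = angles[1] - angles[0]
    lo = int(azimuth // step) * step % 360   # nearest measured angle below
    hi = (lo + step) % 360                   # nearest measured angle above
    w = (azimuth % step) / step              # blend weight toward `hi`
    return tuple(
        [(1 - w) * a + w * b for a, b in zip(h_lo, h_hi)]
        for h_lo, h_hi in zip(hrtf_db[lo], hrtf_db[hi])
    )
```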
  • FIG. 6 is a schematic flowchart of another HRTF rendering method provided by an embodiment of the present application. As shown in Figure 6, when the decoded audio signal is an object signal, the specific steps of the rendering method include:
  • the playback device can perform rendering processing on all or part of the channel audio signals, and the processing methods include, but are not limited to, downmixing of the number of channels (such as downmixing from 7.1 to 5.1), downmixing of the channel dimension (such as from 5.1.4 to 5.1), and so on.
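A per-sample sketch of the 7.1-to-5.1 downmix mentioned above. The -3 dB coefficient applied to the back pair is one common convention (e.g. ITU-style downmixes), not a value fixed by this application:

```python
import math

def downmix_7_1_to_5_1(frame):
    # Fold the back pair (Lb/Rb) into the surround pair (Ls/Rs) with a -3 dB
    # gain (1/sqrt(2)) so the combined surround level stays roughly constant.
    g = 1.0 / math.sqrt(2.0)
    return {
        "L": frame["L"], "R": frame["R"], "C": frame["C"], "LFE": frame["LFE"],
        "Ls": frame["Ls"] + g * frame["Lb"],
        "Rs": frame["Rs"] + g * frame["Rb"],
    }
```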
  • the playback device can perform rendering processing on all or part of the input object audio signal, rendering the object audio content to a specified position and a specified number of channels according to the object's metadata, so that it becomes a channel audio signal.
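Rendering an object to channels according to its position metadata can be sketched with pairwise constant-power panning between the two nearest loudspeakers. This is a simplified, azimuth-only stand-in for the object renderer; real systems typically use e.g. VBAP in three dimensions:

```python
import math

def pan_object_to_channels(sample, azimuth_deg, layout):
    """Distribute one object sample over a loudspeaker layout.

    layout: dict channel name -> loudspeaker azimuth in degrees
    Returns dict channel name -> contributed sample.
    """
    names = sorted(layout, key=layout.get)       # order speakers by azimuth
    out = {n: 0.0 for n in names}
    for i, n in enumerate(names):
        m = names[(i + 1) % len(names)]          # next speaker, wrapping around
        a0, a1 = layout[n], layout[m]
        span = (a1 - a0) % 360
        off = (azimuth_deg - a0) % 360
        if span > 0 and off <= span:
            w = off / span
            out[n] += math.cos(w * math.pi / 2) * sample  # constant-power pair
            out[m] += math.sin(w * math.pi / 2) * sample
            break
    return out
```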
  • the playback device can perform rendering processing on all or part of the input scene audio signal, rendering the scene audio signal to the specified output channels according to the specified numbers of input and output channels, so that it becomes a channel audio signal.
  • the playback device may re-encode the rendered audio data and the rendered metadata, and output the encoded audio stream as an audio signal to be presented and wirelessly transmit it to the wireless headset.
  • the playback device sends the audio signal to be presented to the wireless headset through wireless transmission.
  • the to-be-presented audio signal includes a first audio signal and/or a second audio signal, wherein the first audio signal is an audio signal rendered and processed by the playback device, and the second audio signal is the audio signal to be rendered.
  • the first audio signal is an audio signal that has already been rendered in the playback device
  • the second audio signal is a signal that has not been rendered by the playback device, and requires a headset for further rendering.
  • the wireless headset directly plays the first audio signal, because some high-quality sound sources, such as lossless music, already have high sound quality or already contain the corresponding rendering effects, so no further rendering by the headphones is required. Further, in some application scenarios, the user rarely makes vigorous head movements when using the wireless headset, and the demand for rendering is not high, so rendering by the wireless headset is not required.
  • the wireless headset needs to perform S303 to render the second audio signal.
  • the purpose of rendering processing is to enable the sound to present a stereo surround or panoramic sound effect, to increase the sense of space of the sound, and to simulate a listener's sense of sound direction, such as being able to identify where a car is coming from or going to, and whether it is approaching or receding at high speed.
  • the wireless headset receives the audio signal to be presented sent by the playback device through wireless transmission, and when the audio signal to be presented is a compressed stream, the wireless headset decodes the audio signal to be presented, to obtain the first audio signal, and/or the second audio signal. That is, the audio signal to be presented needs to be decoded to obtain the first audio signal and/or the second audio signal.
  • the decoded first audio signal or second audio signal includes audio content data and audio characteristic information
  • the audio content data may include but is not limited to channel content audio signals
  • the audio characteristic information may include, but is not limited to, sound field type, sampling rate, bit rate information, etc.
  • the wireless transmission methods include: Bluetooth communication, infrared communication, WIFI communication, and LIFI visible light communication.
  • Those skilled in the art can select a specific wireless transmission mode according to the actual situation, which is not limited to the above-mentioned situation, or select several wireless transmission modes to combine with each other to achieve the effect of information interaction between the playback device and the wireless headset.
  • the audio signal to be presented includes a second audio signal, perform rendering processing on the second audio signal to obtain a third audio signal.
  • that the audio signal to be presented includes the second audio signal means that the audio signal to be presented includes only the second audio signal, or that it contains both the first audio signal and the second audio signal.
  • FIG. 7 is a schematic diagram of a data flow of audio signal rendering performed by a wireless headset end according to an embodiment of the present application.
  • the audio signal 71 to be presented includes at least one of the first audio signal 721 and the second audio signal 722, and the second audio signal 722 must be rendered by the wireless headset before it can serve as the subsequent playback audio 74 or a portion of the playback audio 74.
  • the rendering processing of the playback device and the wireless headset in this embodiment includes at least one of binaural virtual rendering, channel signal rendering, object signal rendering, and scene signal rendering.
  • the wireless earphones are traditional wireless Bluetooth earphones, that is, the two earphones are connected by a wire and share related sensors, processing units, and the like. In this case, rendering proceeds as follows:
  • the second audio signal includes audio content data and audio characteristic information, and the audio content is rendered to obtain the rendered audio signal and metadata.
  • the audio content may include, but is not limited to, the audio content signal of the channels and the audio content signal of the objects; the metadata may include, but is not limited to, channel characteristic information (such as sound field type, sampling rate, bit rate, etc.) and the three-dimensional spatial information of objects; and the rendering metadata on the wireless headset side may include, but is not limited to, sensor metadata and an HRTF database.
  • the specific rendering process is the same as the rendering principle of the playback device, and reference may be made to HRTF rendering shown in FIG. 5 and FIG. 6 , and other rendering methods of the playback device introduced in S302 .
  • performing rendering processing on the second audio signal to obtain a third audio signal includes:
  • rendering processing is performed on the second audio signal according to rendering metadata to obtain the third audio signal, wherein the rendering metadata includes first metadata and second metadata, the first metadata is metadata on the playback device side, and the second metadata is metadata on the wireless headset side.
  • the so-called metadata is the information describing the attributes of the data.
  • the first metadata is used to indicate the current motion state of the playback device, the signal transmission strength of the playback device, the direction of signal propagation, the distance between the playback device and the wireless headset or the relative motion state, etc.;
  • the second metadata is used to represent the motion state of the wireless headset; for example, when a person's head sways or shakes, the wireless headset follows the movement. The second metadata can also include information such as the relative movement distance, relative speed, and acceleration of the left and right wireless earphones.
  • the first metadata and the second metadata together provide a rendering basis for realizing high-quality surround sound or panoramic sound effects.
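The two metadata sets described above can be pictured as simple containers plus the relative quantity the renderer actually needs. All field names below are illustrative assumptions, not attribute names taken from this application:

```python
from dataclasses import dataclass, field

@dataclass
class PlaybackDeviceMetadata:          # "first metadata"
    yaw: float = 0.0                   # device orientation, degrees
    distance_to_headset_m: float = 1.0
    signal_strength_dbm: float = -40.0

@dataclass
class HeadsetMetadata:                 # "second metadata"
    yaw: float = 0.0                   # head orientation, degrees
    left_right_offset_m: float = 0.0   # relative movement of the two earbuds
    hrtf_db: dict = field(default_factory=dict)

def relative_head_yaw(first, second):
    # Head orientation relative to the playback device: the quantity the
    # renderer needs when both the device and the listener can move.
    return (second.yaw - first.yaw) % 360.0
```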
  • for example, when a user wears a virtual reality device to play a first-person shooter game, he or she needs to listen for approaching enemies while turning the head left and right to observe, or to determine the enemy's position from the sound of nearby gunfire.
  • the ambient sound therefore needs to be rendered by combining the second metadata of the wireless headset with the first metadata of the playback device worn by the user or placed in the room; these are provided to the wireless headset and/or the playback device and integrated to render the raw audio data for realistic, high-quality sound playback.
  • the first metadata includes first sensor metadata, where the first sensor metadata is used to characterize a motion feature of the playback device; and/or,
  • the second metadata includes second sensor metadata and a head-related transfer function (HRTF) database, wherein the second sensor metadata is used to characterize the motion characteristics of the wireless headset.
  • the first metadata may be detected by a first sensor, and the first sensor may be located on a playback device, a wireless headset, or other objects worn by the user, such as a smart bracelet or a smart watch .
  • when rendering on the playback device, the first metadata is the sensor metadata in FIG. 5; when rendering on the wireless headset, the second sensor metadata is the sensor metadata in FIG. 5, and the head-related transfer function (HRTF) database is the HRTF database data in FIG. 5. That is, the first metadata is used for the rendering of the playback device, and the second metadata is used for the rendering of the wireless headset.
  • the first sensor metadata is obtained through a first sensor, and the first sensor includes at least one of a gyro sensor, a head size sensor, a ranging sensor, a geomagnetic sensor, and an acceleration sensor; and/or ,
  • the second sensor metadata is obtained by a second sensor, and the second sensor includes at least one of a gyro sensor, a head size sensor, a ranging sensor, a geomagnetic sensor, and an acceleration sensor.
  • the wireless earphone includes a first wireless earphone and a second wireless earphone;
  • the second sensor is provided in the first wireless earphone or the second wireless earphone; or,
  • or, the second sensor is provided in both the first wireless earphone and the second wireless earphone; after the first wireless earphone and the second wireless earphone respectively obtain their second sensor metadata, the second sensor metadata are mutually synchronized.
  • S304 Perform subsequent audio playback according to the first audio signal and/or the third audio signal.
  • the wireless headset performs audio playback on the first audio signal and/or the third audio signal.
  • when only the first audio signal is included, the audio signal to be presented transmitted by the playback device does not need to be rendered in the wireless headset and is played directly; when only the third audio signal is included, the audio signal to be presented transmitted by the playback device needs to be rendered in the wireless headset to obtain the third audio signal, which is then played by the wireless headset; when both are included, the first audio signal and the third audio signal are combined according to a combination algorithm before playback.
  • the combination algorithm is not limited, and those skilled in the art can select an appropriate combination algorithm implementation manner according to specific application scenarios.
  • a wireless earphone terminal receives, through wireless transmission, an audio signal to be presented sent by a playback device, and the audio signal to be presented includes an audio signal rendered and processed by the playback device, that is, the first audio signal, and/or an audio signal to be rendered, that is, the second audio signal; then, if the audio signal to be presented includes the second audio signal, the wireless earphone end performs rendering processing on the second audio signal to obtain the third audio signal; finally, the wireless earphone end performs subsequent audio playback according to the first audio signal and/or the third audio signal.
  • FIG. 8 is a schematic flowchart of another audio processing method provided by an embodiment of the present application. As shown in Figure 8, the specific steps of the method include:
  • the playback device obtains the original audio signal from resource libraries such as internal memory, database, and the Internet.
  • the wireless earphone sends an indication signal to the playback device through wireless transmission.
  • the indication signal is used to instruct the playback device to render the original audio signal according to a corresponding preset processing manner, so as to acquire the audio signal to be presented.
  • the function of the indication signal is to indicate the rendering processing capability of the wireless headset. For example, when the wireless headset has sufficient power and strong processing capability, in the handshake stage between the wireless headset and the playback device (that is, the stage of establishing a wireless connection), the headset indicates to the playback device that a higher proportion of rendering tasks can be allocated to the wireless headset; when the wireless headset carries less power or its processing capability is weak, or in order to keep the wireless headset working for a longer time (that is, in power-saving mode), the wireless headset instructs the playback device to allocate a lower proportion of rendering tasks, or not to allocate rendering tasks to the wireless headset at all.
  • alternatively, the wireless headset sends its performance parameters through wireless transmission, and after receiving the performance parameters of the wireless headset, the playback device obtains the indication signal by querying a mapping table between performance parameters and indication signals, or calculates the indication signal from the performance parameters with a preset algorithm.
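One possible form of the mapping from performance parameters to the indication signal is sketched below. The parameter names, thresholds, and code values are all illustrative placeholders; this application does not fix any of them:

```python
def determine_indication_signal(battery_percent, dsp_mips, power_saving):
    # Codes (hypothetical): 0 = headset renders everything,
    # 1 = playback device renders everything, 2 = rendering is split.
    if power_saving or battery_percent < 20 or dsp_mips < 50:
        return 1   # headset weak or saving power: device does all rendering
    if battery_percent > 80 and dsp_mips >= 200:
        return 0   # headset strong: it can take all rendering tasks
    return 2       # otherwise share the work
```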
• the indication signal includes an identification code that distinguishes three cases:
• the playback device does not render the original audio signal; the audio signal to be presented includes the second audio signal but does not include the first audio signal, and the wireless headset performs all rendering on the original audio signal;
• the playback device performs all rendering on the original audio signal; the audio signal to be presented includes the first audio signal but does not include the second audio signal, and the wireless headset does not render the original audio signal;
• the playback device partially renders the original audio signal; the audio signal to be presented includes both the first audio signal and the second audio signal, and the wireless earphone renders the remainder of the original audio signal.
• the indication information can be sent from the wireless headset to the playback device when the wireless headset is connected to the playback device for the first time, so that no further processing resources of the playback device or the wireless headset are consumed afterwards.
  • the indication information can also be triggered and transmitted periodically, so as to be changed according to different playing contents, so that the sound quality of the wireless earphone can be dynamically adjusted.
  • the indication information may also trigger transmission according to user instructions received by sensors in the wireless headset.
  • FIG. 9 is a schematic diagram of a data link of an audio processing signal in a playback device and a wireless headset according to an embodiment of the present application.
  • the function of the indication signal is to guide the data flow of the original audio signal S0 .
  • the original audio signal S0 includes a fourth audio signal S01 and/or a fifth audio signal S02, wherein the fourth audio signal S01 is used to generate the first audio signal S40 after processing, and the fifth audio signal S02 is used for generating the second audio signal S41;
• after acquiring the original audio signal S0, the playback device performs decoding processing on the fourth audio signal S01 to obtain a sixth audio signal S1, where the sixth audio signal S1 includes a seventh audio signal S11 and/or an eighth audio signal S12;
• the eighth audio signal S12 and the ninth audio signal S2 are encoded to obtain a tenth audio signal S30; the audio signal to be presented includes the fifth audio signal S02 and the tenth audio signal S30;
  • performing rendering processing on the seventh audio signal S11 includes:
• Rendering processing is performed on the seventh audio signal S11 according to rendering metadata to obtain the ninth audio signal S2, wherein the rendering metadata includes first metadata D3 and second metadata D5; the first metadata D3 is the metadata of the playback device, and the second metadata D5 is the metadata of the wireless headset.
• in the audio signal transmission link shown in FIG. 9, there may be multiple data links from the original audio signal to the subsequently played audio, or there may be only one data link.
  • the indication signal and/or the original audio signal determine the specific usage of the data link.
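For illustration only (outside the claims), the playback-device side of the FIG. 9 data link can be sketched as a simple pipeline using the signal names from the text; the decode, render and encode functions here are hypothetical placeholders that merely tag the data they touch:

```python
# Sketch of the FIG. 9 playback-device data link, using the signal names
# from the description. decode/render/encode are placeholder functions
# standing in for the real codec and renderer.

def decode(s01):               # S01 -> S1 = (S11, S12)
    return {"S11": ("decoded", s01), "S12": ("decoded", s01)}

def render(s11, metadata):     # S11 + rendering metadata (D3, D5) -> S2
    return ("rendered", s11, metadata)

def encode(s12, s2):           # (S12, S2) -> S30
    return ("encoded", s12, s2)

def playback_device_link(s0, metadata):
    """S0 = (S01, S02) -> audio signal to be presented = (S02, S30)."""
    s01, s02 = s0
    s1 = decode(s01)                   # sixth audio signal S1
    s2 = render(s1["S11"], metadata)   # ninth audio signal S2
    s30 = encode(s1["S12"], s2)        # tenth audio signal S30
    return {"S02": s02, "S30": s30}    # sent to the wireless headset
```

The indication signal would select which of these stages actually run; in the all-rendering-on-headset case, for example, S0 would pass through untouched.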
  • the playback device sends the audio signal to be presented to the wireless headset through wireless transmission.
• if the audio signal to be presented includes the second audio signal, rendering processing is performed on the second audio signal to obtain the third audio signal.
  • steps S804-S805 are similar to S302-S304 of the audio processing method shown in FIG. 3 , and details are not repeated here.
• the wireless earphone receives, through wireless transmission, the audio signal to be presented sent by the playback device; the audio signal to be presented includes an audio signal already rendered by the playback device, that is, the first audio signal, and/or an audio signal still to be rendered, that is, the second audio signal; then, if the audio signal to be presented includes the second audio signal, the wireless earphone performs rendering processing on the second audio signal to obtain the third audio signal; finally, the wireless earphone performs subsequent audio playback according to the first audio signal and/or the third audio signal.
  • FIG. 10 is a schematic flowchart of still another audio processing method provided by an embodiment of the present application. As shown in Figure 10, the specific steps of the method include:
  • the playback device obtains the original audio signal, and the original audio signal may include lossless music, game audio, movie audio, and the like. Then, the playback device performs at least one of decoding, rendering, and re-encoding the original audio signal.
• for this step S1001, refer to the description of the data link allocation in the playback device part of FIG. 9 under S803, which is not repeated here.
  • the first wireless earphone receives the first audio signal to be presented sent by the playback device.
  • the second wireless earphone receives the second audio signal to be presented sent by the playback device.
  • the wireless earphone includes a first wireless earphone and a second wireless earphone, wherein the first wireless earphone and the second wireless earphone are used to establish a wireless connection with a playback device.
  • S10021 and S10022 may occur at the same time, and the sequence is not limited.
  • the first wireless headset performs rendering processing on the first audio signal to be presented, so as to obtain the first playback audio signal.
  • the second wireless headset performs rendering processing on the second to-be-presented audio signal to obtain a second playback audio signal.
  • S10031 and S10032 may occur at the same time, and the sequence is not limited.
  • the first wireless headset decodes the first audio signal to be presented to obtain a first decoded audio signal
  • the first wireless headset performs rendering processing on the first audio signal to be presented, including:
  • the first wireless headset performs rendering processing according to the first decoded audio signal and rendering metadata to obtain the first playback audio signal.
  • the second wireless headset decodes the second audio signal to be presented to obtain a second decoded audio signal
  • the second wireless headset performs rendering processing on the second audio signal to be presented, including:
  • the second wireless headset performs rendering processing according to the second decoded audio signal and the rendering metadata to obtain the second playback audio signal.
  • the rendering metadata includes at least one of first wireless headset metadata, second wireless headset metadata, and playback device metadata.
  • the first wireless headset metadata includes first headset sensor metadata and a head-related transformation function HRTF database, wherein the first headset sensor metadata is used to characterize the motion characteristics of the first wireless headset;
  • the second wireless headset metadata includes second headset sensor metadata and a head-related transformation function HRTF database, wherein the second headset sensor metadata is used to characterize the motion characteristics of the second wireless headset;
  • the playback device metadata includes playback device sensor metadata, wherein the playback device sensor metadata is used to characterize motion characteristics of the playback device.
  • the method before performing the rendering processing, the method further includes:
  • the first wireless headset synchronizes the rendering metadata with the second wireless headset.
  • the first wireless headset is provided with a headset sensor
  • the second wireless headset is not provided with a headset sensor
  • the playback device is not provided with a playback device sensor
• the first wireless headset and the second wireless headset synchronize the rendering metadata, including:
  • the first wireless headset sends the first headset sensor metadata to the second wireless headset, and the second wireless headset uses the first headset sensor metadata as the second headset sensor metadata.
  • both the first wireless earphone and the second wireless earphone are provided with earphone sensors, and the playback device is not provided with a playback device sensor, the first wireless earphone and the second wireless earphone synchronize the Rendering metadata, including:
  • the first wireless headset sends the first headset sensor metadata to the second wireless headset
  • the second wireless headset sends the second headset sensor metadata to the first wireless headset
  • the first wireless headset and the second wireless headset respectively determine the rendering metadata according to the first headset sensor metadata, the second headset sensor metadata, and a preset numerical algorithm; or,
  • the first wireless headset sends the first headset sensor metadata to the playback device
• the second wireless headset sends the second headset sensor metadata to the playback device, so that the playback device determines the rendering metadata according to the first headphone sensor metadata, the second headphone sensor metadata, and a preset numerical algorithm;
  • the first wireless headset and the second wireless headset respectively receive the rendering metadata.
  • the first wireless headset is provided with a headset sensor
  • the second wireless headset is not provided with a headset sensor
  • the playback device is provided with a playback device sensor
• the first wireless headset synchronizes the rendering metadata with the second wireless headset, including:
• the first wireless earphone sends the first earphone sensor metadata to the playback device, so that the playback device determines the rendering metadata according to the first earphone sensor metadata, the playback device sensor metadata, and a preset numerical algorithm;
  • the first wireless headset and the second wireless headset respectively receive the rendering metadata; or,
  • the first wireless earphone receives the playback device sensor metadata sent by the playback device;
  • the first wireless headset determines the rendering metadata according to the first headset sensor metadata, the playback device sensor metadata, and a preset numerical algorithm
  • the first wireless headset sends the rendering metadata to the second wireless headset.
  • both the first wireless earphone and the second wireless earphone are provided with earphone sensors, and the playback device is provided with a playback device sensor, the first wireless earphone and all The second wireless headset synchronizes the rendering metadata, including:
  • the first wireless headset sends the first headset sensor metadata to the playback device
• the second wireless headset sends the second headset sensor metadata to the playback device, so that the playback device determines the rendering metadata according to the first headphone sensor metadata, the second headphone sensor metadata, the playback device sensor metadata, and a preset numerical algorithm;
  • the first wireless headset and the second wireless headset respectively receive the rendering metadata; or,
  • the first wireless headset sends the first headset sensor metadata to the second wireless headset
  • the second wireless headset sends the second headset sensor metadata to the first wireless headset
  • the first wireless earphone and the second wireless earphone respectively receive the playback device sensor metadata
• the first wireless headset and the second wireless headset respectively determine the rendering metadata according to the first headset sensor metadata, the second headset sensor metadata, the playback device sensor metadata, and a preset numerical algorithm.
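The description leaves the "preset numerical algorithm" that merges the sensor metadata open. Purely for illustration (the averaging choice and field names are assumptions, not part of the claims), a per-field average of the available sources could look like this:

```python
# Hypothetical sketch of merging first-headset, second-headset and
# (optionally) playback-device sensor metadata into shared rendering
# metadata. A per-field average is assumed purely for illustration;
# the patent does not specify the preset numerical algorithm.

def merge_sensor_metadata(*sources):
    """Average each motion field over all non-empty metadata sources.
    Pass None for an end that has no sensor."""
    sources = [m for m in sources if m]
    return {field: sum(m[field] for m in sources) / len(sources)
            for field in sources[0]}
```

Both earphones applying the same deterministic merge to the same synchronized inputs is what keeps their rendering consistent, whichever end computes it.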
• the wireless earphones are TWS (true wireless stereo) earphones, that is, the two earphones are physically separate and wirelessly coupled, and each earphone can have its own processing unit and sensors. In this case, the first wireless earphone is the left earphone, and the second wireless earphone is the right earphone.
  • the synchronous rendering method of the first wireless earphone and the second wireless earphone is as follows:
  • FIG. 11 is a schematic diagram of a rendering process of channel information of a TWS true wireless headset according to an embodiment of the present application.
  • the first wireless headset plays the first playback audio signal.
  • the second wireless headset plays the second playback audio signal.
  • S10041 and S10042 may occur at the same time, and the sequence is not limited.
  • the audio signal to be presented includes at least one of a channel-based audio signal, an object-based audio signal, and a scene-based audio signal.
  • the rendering processing includes at least one of binaural virtual rendering, channel signal rendering, object signal rendering, and scene signal rendering.
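Binaural virtual rendering, one of the listed rendering methods, amounts to convolving each source signal with the left and right head-related impulse responses selected for its direction from the HRTF database. A minimal sketch for one virtual source (the 3-tap "HRTFs" are toy values, not a real HRTF database):

```python
# Minimal sketch of binaural virtual rendering: convolve a mono source
# signal with left/right HRTF impulse responses chosen for its direction.
# The impulse responses below are illustrative toy values only.

def convolve(signal, ir):
    """Direct-form FIR convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out

def binaural_render(mono, hrtf_left, hrtf_right):
    """Return the (left, right) ear signals for one virtual source."""
    return convolve(mono, hrtf_left), convolve(mono, hrtf_right)
```

The head-tracking metadata discussed above would enter this picture by selecting which HRTF pair is used as the listener's head turns.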
  • wireless transmission methods include: Bluetooth communication, infrared communication, WIFI communication, and LIFI visible light communication.
  • one playback device can also be connected to multiple pairs of wireless earphones at the same time.
• the rendering and distribution of audio information to the multiple pairs of wireless earphones can still be performed with reference to the above-mentioned embodiments, and the rendering division ratio between the playback device and each pair of wireless earphones can be set according to the processing capability of that pair of wireless earphones.
• the rendering processing resources among the pairs of wireless headphones can also be scheduled comprehensively by the playback device; that is, for a pair of wireless headphones with weak processing capability, other wireless headphones with strong processing capability connected to the same playback device can be called upon to assist in rendering the audio information.
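The scheduling policy is not specified in the description; purely as an assumption for illustration, the playback device could assign rendering tasks to the connected pairs in proportion to their reported capability, so that strong pairs effectively shoulder work that weak pairs cannot:

```python
# Hypothetical sketch of the playback device scheduling rendering work
# across several connected headset pairs: each pair receives a share of
# the rendering tasks proportional to its reported capability score.
# Names and the proportional policy are assumptions, not from the patent.

def schedule_rendering(task_count, capabilities):
    """capabilities: {pair_id: score}. Returns {pair_id: task share}."""
    total = sum(capabilities.values())
    shares = {pid: (task_count * score) // total
              for pid, score in capabilities.items()}
    # hand any remainder left by integer division to the strongest pair
    strongest = max(capabilities, key=capabilities.get)
    shares[strongest] += task_count - sum(shares.values())
    return shares
```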
• the first wireless earphone and the second wireless earphone respectively receive, through wireless transmission, the first audio signal to be presented and the second audio signal to be presented sent by the playback device; each then performs rendering processing on its received signal to obtain the first playback audio signal and the second playback audio signal; finally, the first wireless earphone and the second wireless earphone each play the corresponding playback audio signal.
  • FIG. 12 is a schematic structural diagram of an audio processing apparatus provided by an embodiment of the present application. As shown in FIG. 12 , the audio processing apparatus 1200 provided in this embodiment includes:
• An acquisition module, configured to receive, through wireless transmission, the audio signal to be presented sent by the playback device, where the audio signal to be presented includes a first audio signal and/or a second audio signal; the first audio signal is the audio signal that has been rendered by the playback device, and the second audio signal is the audio signal to be rendered;
  • a rendering module configured to perform rendering processing on the second audio signal when the audio signal to be presented includes the second audio signal to obtain a third audio signal
  • a playback module configured to perform subsequent audio playback according to the first audio signal and/or the third audio signal.
• before the receiving module receives the audio signal to be presented sent by the playback device through wireless transmission, the apparatus further includes:
  • a sending module configured to send an indication signal to the playback device through wireless transmission, where the indication signal is used to instruct the playback device to render the original audio signal according to a corresponding preset processing method to obtain the audio to be presented Signal.
• before the sending module sends the indication signal to the playback device through wireless transmission, the apparatus further includes:
  • the acquiring module is further configured to acquire performance parameters of the wireless headset, and determine the indication signal according to the performance parameters.
• before the sending module sends the indication signal to the playback device through wireless transmission, the apparatus further includes:
  • the acquisition module is further configured to receive audio characteristic information sent by the playback device, where the audio characteristic information includes characteristic parameters of the original audio signal input to the playback device, and the characteristic parameters include: a code stream format , at least one of channel parameters, object parameters, and scene component parameters.
• the indication signal includes an identification code that distinguishes three cases:
• the playback device does not render the original audio signal; the audio signal to be presented includes the second audio signal but does not include the first audio signal, and the wireless headset performs all rendering on the original audio signal;
• the playback device performs all rendering on the original audio signal; the audio signal to be presented includes the first audio signal but does not include the second audio signal, and the wireless headset does not render the original audio signal;
• the playback device partially renders the original audio signal; the audio signal to be presented includes both the first audio signal and the second audio signal, and the wireless earphone renders the remainder of the original audio signal.
• after the acquisition module receives the audio signal to be presented sent by the playback device through wireless transmission, the apparatus further includes:
  • a decoding module configured to perform decoding processing on the audio signal to be presented to obtain the first audio signal and/or the second audio signal.
  • the rendering module configured to perform rendering processing on the second audio signal to obtain a third audio signal, includes:
• the rendering module is configured to perform rendering processing on the second audio signal according to rendering metadata to obtain the third audio signal, wherein the rendering metadata includes first metadata and second metadata; the first metadata is metadata of the playback device, and the second metadata is metadata of the wireless headset.
  • the first metadata includes first sensing module metadata, wherein the first sensing module metadata is used to characterize the motion characteristics of the playback device; and/or,
  • the second metadata includes second sensing module metadata and a head-related transformation function HRTF database, wherein the second sensing module metadata is used to characterize the motion characteristics of the wireless headset.
• the headset sensor metadata is obtained through a first sensing module, and the first sensing module includes at least one of a gyroscope sensing sub-module, a head size sensing sub-module, a ranging sensing sub-module, a geomagnetic sensing sub-module, and an acceleration sensing sub-module; and/or,
• the playback device sensor metadata is obtained through a second sensing module, and the second sensing module includes at least one of a gyroscope sensing sub-module, a head size sensing sub-module, a ranging sensing sub-module, a geomagnetic sensing sub-module, and an acceleration sensing sub-module.
  • the audio processing device includes a first audio processing device and a second audio processing device;
  • the first audio processing device or the second audio processing device is provided with the second sensing sub-module; or,
• or both the first audio processing device and the second audio processing device are provided with the second sensing sub-module; after the acquisition module of the first audio processing device and the acquisition module of the second audio processing device obtain the playback device sensor metadata, each further includes:
  • the synchronization module is used for synchronizing the metadata of the sensor of the playback device with each other.
  • the first audio processing device includes:
  • a first receiving module configured to receive the first audio signal to be presented sent by the playback device
  • a first rendering module configured to perform rendering processing on the first audio signal to be presented to obtain a first playback audio signal
  • a first playing module for playing the first playing audio signal
  • the second audio processing device includes:
  • a second receiving module configured to receive the second to-be-presented audio signal sent by the playback device
  • a second rendering module configured to perform rendering processing on the second to-be-presented audio signal to obtain a second playback audio signal
  • the second playing module is used for playing the second playing audio signal.
  • the first audio processing device further includes:
  • a first decoding module configured to perform decoding processing on the first to-be-presented audio signal to obtain a first decoded audio signal
  • the first rendering module is specifically configured to: perform rendering processing according to the first decoded audio signal and rendering metadata to obtain the first playback audio signal;
  • the second audio processing device further includes:
  • a second decoding module configured to perform decoding processing on the second to-be-presented audio signal to obtain a second decoded audio signal
  • the second rendering module is specifically configured to: perform rendering processing according to the second decoded audio signal and rendering metadata to obtain the second playback audio signal.
  • the rendering metadata includes at least one of first wireless headset metadata, second wireless headset metadata, and playback device metadata.
• the first wireless headset metadata includes first headset sensor metadata and a head-related transformation function HRTF database, wherein the first headset sensor metadata is used to characterize the motion characteristics of the first wireless headset;
  • the second wireless headset metadata includes second headset sensor metadata and a head-related transformation function HRTF database, wherein the second headset sensor metadata is used to characterize the motion characteristics of the second wireless headset;
  • the playback device metadata includes playback device sensor metadata, wherein the playback device sensor metadata is used to characterize motion characteristics of the playback device.
  • the first audio processing device further includes:
  • a first synchronization module for synchronizing the rendering metadata with the second wireless headset
  • the second audio processing device further includes:
  • a second synchronization module configured to synchronize the rendering metadata with the first wireless headset.
• the first synchronization module is specifically configured to: send the first headset sensor metadata to the second wireless headset; the second synchronization module is specifically configured to: use the received first headset sensor metadata as the second headset sensor metadata.
  • the first synchronization module is specifically used for:
  • the second synchronization module is specifically used for:
  • the rendering metadata is determined according to the first headphone sensor metadata, the second headphone sensor metadata, and a preset numerical algorithm; or,
  • the first synchronization module is specifically used for:
  • the second synchronization module is specifically used for:
  • the rendering metadata is received.
  • the first synchronization module is specifically used for:
  • the first synchronization module is specifically used for:
  • the second synchronization module is specifically used for:
  • the rendering metadata is determined according to the first headphone sensor metadata, the second headphone sensor metadata, the playback device sensor metadata, and a preset numerical algorithm.
  • the audio signal to be presented includes at least one of a channel-based audio signal, an object-based audio signal, and a scene-based audio signal.
  • the rendering processing includes: at least one of binaural virtual rendering, channel signal rendering, object signal rendering, and scene signal rendering.
  • the wireless transmission mode includes: Bluetooth communication, infrared communication, WIFI communication, and LIFI visible light communication.
• the audio processing device provided by the embodiment shown in FIG. 12 can execute the method corresponding to the wireless earphone end provided by any of the above method embodiments; its specific implementation principles, technical features, technical terms and technical effects are similar and will not be repeated here.
  • FIG. 13 is a schematic structural diagram of another audio processing apparatus provided by an embodiment of the present application. As shown in FIG. 13 , the audio processing apparatus 1300 provided in this embodiment includes:
• an acquisition module, configured to receive an original audio signal and generate an audio signal to be presented according to the original audio signal, where the audio signal to be presented includes a first audio signal and/or a second audio signal; the first audio signal is the audio signal rendered by the playback device, and the second audio signal is the audio signal to be rendered;
  • the sending module is used for sending the audio signal to be presented to the wireless headset through wireless transmission.
• before the sending module sends the audio signal to be presented to the wireless headset through wireless transmission, the apparatus further includes:
  • the acquisition module is further configured to receive an indication signal sent by the wireless headset through the wireless transmission method, where the indication signal is used to instruct the playback device to render the original audio signal according to a corresponding preset processing method , to obtain the audio signal to be presented.
• before the sending module sends the audio signal to be presented to the wireless headset through wireless transmission, the apparatus further includes:
• the acquisition module is further configured to receive the performance parameters of the wireless headset through wireless transmission, and to determine an indication signal according to the performance parameters, where the indication signal is used to instruct the playback device to render the original audio signal according to a corresponding preset processing manner to obtain the audio signal to be presented.
  • the acquisition module is further configured to receive the performance parameters of the wireless headset through the wireless transmission, and determine an indication signal according to the performance parameters, including:
  • the obtaining module is further configured to obtain characteristic parameters of the original audio signal, where the characteristic parameters include: at least one of a code stream format, a channel parameter, an object parameter and a scene component parameter;
  • the obtaining module is further configured to determine the indication signal according to the characteristic parameter and the performance parameter.
• the indication signal includes an identification code that distinguishes three cases:
• the playback device does not render the original audio signal; the audio signal to be presented includes the second audio signal but does not include the first audio signal, and the wireless headset performs all rendering on the original audio signal;
• the playback device performs all rendering on the original audio signal; the audio signal to be presented includes the first audio signal but does not include the second audio signal, and the wireless headset does not render the original audio signal;
• the playback device partially renders the original audio signal; the audio signal to be presented includes both the first audio signal and the second audio signal, and the wireless headset renders the remainder of the original audio signal.
• the original audio signal includes a fourth audio signal and/or a fifth audio signal, wherein the fourth audio signal is used, after processing, to generate the first audio signal, and the fifth audio signal is used to generate the second audio signal;
• after the acquisition module acquires the original audio signal, the apparatus further includes:
  • a decoding module configured to decode the fourth audio signal to obtain a sixth audio signal, where the sixth audio signal includes a seventh audio signal and/or an eighth audio signal;
  • a rendering module configured to perform rendering processing on the seventh audio signal to obtain a ninth audio signal
  • an encoding module configured to encode the eighth audio signal and the ninth audio signal to obtain a tenth audio signal, and the to-be-presented audio signal includes the fifth audio signal and the tenth audio signal.
  • the rendering module configured to perform rendering processing on the seventh audio signal, includes:
• the rendering module is configured to perform rendering processing on the seventh audio signal according to rendering metadata to obtain the ninth audio signal, wherein the rendering metadata includes first metadata and second metadata; the first metadata is metadata of the playback device, and the second metadata is metadata of the wireless headset.
  • the first metadata includes first sensing sub-module metadata, wherein the first sensing sub-module metadata is used to characterize the motion characteristics of the playback device; and/or ,
• the second metadata includes second sensing sub-module metadata and a head-related transformation function HRTF database, wherein the second sensing sub-module metadata is used to characterize the motion characteristics of the wireless headset.
• the first sensing sub-module metadata is obtained through the first sensing sub-module, and the first sensing sub-module includes at least one of a gyroscope sensing sub-module, a head size sensing sub-module, a ranging sensing sub-module, a geomagnetic sensing sub-module, and an acceleration sensing sub-module; and/or,
• the second sensing sub-module metadata is obtained through the second sensing sub-module, and the second sensing sub-module includes at least one of a gyroscope sensing sub-module, a head size sensing sub-module, a ranging sensing sub-module, a geomagnetic sensing sub-module, and an acceleration sensing sub-module.
  • the audio signal to be presented includes at least one of a channel-based audio signal, an object-based audio signal, and a scene-based audio signal.
  • the rendering processing includes: at least one of binaural virtual rendering, channel signal rendering, object signal rendering, and scene signal rendering.
  • the wireless transmission method includes: Bluetooth communication, infrared communication, WIFI communication, and LIFI visible light communication.
• the audio processing apparatus provided by the embodiment shown in FIG. 13 can execute the method corresponding to the playback device provided by any of the above method embodiments; its specific implementation principles, technical features, technical terms and technical effects are similar and will not be repeated here.
  • FIG. 14 is a schematic structural diagram of a wireless headset provided by the application.
  • the electronic device 1400 may include: at least one processor 1401 and a memory 1402 .
  • FIG. 14 shows an electronic device with a processor as an example.
  • the memory 1402 is used to store programs.
  • the program may include program code, and the program code includes computer operation instructions.
  • Memory 1402 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk memory.
  • the processor 1401 is configured to execute the computer-executed instructions stored in the memory 1402, so as to implement the methods corresponding to the wireless earphone terminals described in the above method embodiments.
  • the processor 1401 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
  • the memory 1402 may be independent of or integrated with the processor 1401.
  • the electronic device 1400 may further include:
  • the bus 1403 is used to connect the processor 1401 and the memory 1402.
  • the bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on, but this does not mean that there is only one bus or only one type of bus.
  • the memory 1402 and the processor 1401 may communicate through an internal interface.
  • FIG. 15 is a schematic structural diagram of another playback device provided by this application.
  • the electronic device 1500 may include: at least one processor 1501 and a memory 1502 .
  • FIG. 15 shows an electronic device with a processor as an example.
  • the memory 1502 is used to store programs.
  • the program may include program code, and the program code includes computer operation instructions.
  • Memory 1502 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk memory.
  • the processor 1501 is configured to execute the computer-executed instructions stored in the memory 1502 to implement the methods corresponding to the playback device described in the above method embodiments.
  • the processor 1501 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
  • the memory 1502 may be independent of or integrated with the processor 1501.
  • the electronic device 1500 may further include:
  • the bus 1503 is used to connect the processor 1501 and the memory 1502.
  • the bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on, but this does not mean that there is only one bus or only one type of bus.
  • the memory 1502 and the processor 1501 can communicate through an internal interface.
  • the present application also provides a computer-readable storage medium
  • the computer-readable storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
  • the computer-readable storage medium stores program instructions, and the program instructions are used for the methods corresponding to the wireless earphone terminals in the above embodiments.
  • the present application also provides a computer-readable storage medium
  • the computer-readable storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
  • the computer-readable storage medium stores program instructions, and the program instructions are used for the methods corresponding to the playback device in the above embodiments.


Abstract

本申请提供一种音频处理方法、装置、系统以及存储介质,首先无线耳机端通过无线传输方式接收播放设备发送的待呈现音频信号,待呈现音频信号包括播放设备渲染处理后的音频信号即第一音频信号以及待渲染的音频信号即第二音频信号;然后若待呈现音频信号包括第二音频信号,则无线耳机端对第二音频信号进行渲染处理,以获得第三音频信号;最后无线耳机端根据第一音频信号和/或第三音频信号进行后续音频播放。从而实现无线耳机能够呈现高品质环绕声和全景声效果的技术效果。

Description

音频处理方法、装置、系统以及存储介质
本申请要求于2020年07月31日提交中国专利局、申请号为202010762076.3、申请名称为“音频处理方法、装置、系统以及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及电子技术领域,尤其涉及一种音频处理方法、装置、系统以及存储介质。
背景技术
随着智能移动设备的发展,耳机成为人们日常收听声音的必备品。而无线耳机由于其便利性,越来越受到市场青睐,甚至逐渐成为了主流耳机产品。随之而来的是人们对于声音品质的要求也越来越高,不仅在音质上逐渐追求无损化,在声音的空间感和沉浸感上的追求也逐步提升,从最初的单声道、立体声、到现在更多开始追求360°环绕声和真正全方位沉浸感的三维全景声。
目前，现有的无线耳机，如传统无线蓝牙耳机和TWS真无线耳机，只能呈现双声道立体声声场，其体验感越来越不能满足人们的实际需求，特别是看电影时对声音空间感的需求，以及玩游戏时对声音方位定位的需求。
因此,如何在耳机端,特别是在现在日益流行的无线耳机端,呈现真正的环绕声和全景声效果成为了亟待解决的技术问题。
发明内容
本申请提供一种音频处理方法、装置、系统以及存储介质,以解决无线耳机如何呈现高品质环绕声和全景声效果的技术问题。
第一方面,本申请提供一种音频处理方法,应用于无线耳机,包括:
通过无线传输方式接收播放设备发送的待呈现音频信号,所述待呈现音频信号包括第一音频信号和/或第二音频信号,其中,所述第一音频信号为在所述播放设备渲染处理后的音频信号,所述第二音频信号为待渲染的音频信号;
若所述待呈现音频信号包括所述第二音频信号,则对所述第二音频信号进行渲染处理,以获得第三音频信号;
根据所述第一音频信号和/或所述第三音频信号进行后续音频播放。
在一种可能的设计中,在所述通过无线传输方式接收播放设备发送的待呈现音频信号之前,包括:
通过无线传输方式向所述播放设备发送指示信号,所述指示信号用于指示所述播放设备对原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
在一种可能的设计中,在所述通过无线传输方式向所述播放设备发送指示信号之前,还包括:
获取所述无线耳机的性能参数,并根据所述性能参数确定所述指示信号。
在一种可能的设计中,在所述通过无线传输方式向所述播放设备发送指示信号之前,还包括:
接收所述播放设备发送的音频特性信息,所述音频特性信息包括输入至所述播放设备的所述原始音频信号的特性参数,所述特性参数包括:码流格式、声道参数、对象参数以及场景成分参数中的至少一种。
可选的,所述指示信号包括标识码;
其中,若所述标识码为第一字段,则所述播放设备未对所述原始音频信号进行渲染,所述待呈现音频信号包括所述第二音频信号,未包括所述第一音频信号,所述无线耳机对所述原始音频信号进行全部渲染;
若所述标识码为第二字段,则所述播放设备对所述原始音频信号进行全部渲染,所述待呈现音频信号包括所述第一音频信号,未包括所述第二音频信号,所述无线耳机未对所述原始音频信号进行渲染;
若所述标识码为第三字段，则所述播放设备对所述原始音频信号进行部分渲染，所述待呈现音频信号包括所述第一音频信号和所述第二音频信号，所述无线耳机对所述原始音频信号剩余部分进行渲染。
可选的,在所述通过无线传输方式接收播放设备发送的待呈现音频信号之后,还包括:
对所述待呈现音频信号进行解码处理,以获得所述第一音频信号和/或所述第二音频信号。
可选的,所述对所述第二音频信号进行渲染处理,以获得第三音频信号, 包括:
根据渲染元数据对所述第二音频信号进行渲染处理,以获得所述第三音频信号,其中,所述渲染元数据包括第一元数据以及第二元数据,所述第一元数据为所述播放设备端的元数据,所述第二元数据为无线耳机端的元数据。
在一种可能的设计中，所述第一元数据包括播放设备传感器元数据，其中，所述播放设备传感器元数据用于表征所述播放设备的运动特征；和/或，
所述第二元数据包括耳机传感器元数据以及头相关变换函数HRTF数据库，其中，所述耳机传感器元数据用于表征所述无线耳机的运动特征。
在一种可能的设计中,所述耳机传感器元数据通过耳机传感器获得,所述耳机传感器包括陀螺仪传感器、头部大小传感器、测距传感器、地磁传感器以及加速度传感器中的至少一种;和/或,
所述播放设备传感器元数据通过播放设备传感器获得,所述播放设备传感器包括陀螺仪传感器、头部大小传感器、测距传感器、地磁传感器以及加速度传感器中的至少一种。
在一种可能的设计中,所述无线耳机包括第一无线耳机以及第二无线耳机;
所述第一无线耳机或所述第二无线耳机中设置有所述耳机传感器;或者,
所述第一无线耳机与所述第二无线耳机中均设置有所述耳机传感器,则在所述第一无线耳机与所述第二无线耳机分别获取到所述耳机传感器元数据之后,对所述耳机传感器元数据进行相互同步。
在一种可能的设计中,所述第一无线耳机与所述第二无线耳机用于与所述播放设备建立无线连接;所述通过无线传输方式接收播放设备发送的待呈现音频信号,包括:
所述第一无线耳机接收所述播放设备发送的第一待呈现音频信号,所述第二无线耳机接收所述播放设备发送的第二待呈现音频信号;
对应的,在所述无线耳机中的渲染处理,包括:
所述第一无线耳机对所述第一待呈现音频信号进行渲染处理,以获取第一播放音频信号,所述第二无线耳机对所述第二待呈现音频信号进行渲染处理,以获取第二播放音频信号;
所述第一无线耳机播放所述第一播放音频信号，所述第二无线耳机播放所述第二播放音频信号。
在一种可能的设计中,在所述第一无线耳机对所述第一待呈现音频信号进行渲染处理之前,还包括:
所述第一无线耳机对所述第一待呈现音频信号进行解码处理,以获取第一解码音频信号;
对应的,所述第一无线耳机对所述第一待呈现音频信号进行渲染处理,包括:
所述第一无线耳机根据所述第一解码音频信号以及渲染元数据进行渲染处理,以获取所述第一播放音频信号;以及
在所述第二无线耳机对所述第二待呈现音频信号进行渲染处理之前,还包括:
所述第二无线耳机对所述第二待呈现音频信号进行解码处理,以获取第二解码音频信号;
对应的,所述第二无线耳机对所述第二待呈现音频信号进行渲染处理,包括:
所述第二无线耳机根据所述第二解码音频信号以及渲染元数据进行渲染处理,以获取所述第二播放音频信号。
在一种可能的设计中,所述渲染元数据包括第一无线耳机元数据、第二无线耳机元数据以及播放设备元数据中的至少一种。
在一种可能的设计中,所述第一无线耳机元数据包括第一耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述第一耳机传感器元数据用于表征所述第一无线耳机的运动特征;
所述第二无线耳机元数据包括第二耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述第二耳机传感器元数据用于表征所述第二无线耳机的运动特征;
所述播放设备元数据包括播放设备传感器元数据,其中,所述播放设备传感器元数据用于表征所述播放设备的运动特征。
在一种可能的设计中,在进行所述渲染处理之前,还包括:
所述第一无线耳机与所述第二无线耳机同步所述渲染元数据。
在一种可能的设计中,若所述第一无线耳机上设置有耳机传感器,所述第二无线耳机上未设置有耳机传感器,所述播放设备上未设置有播放设备传 感器,则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据,包括:
所述第一无线耳机将所述第一耳机传感器元数据发送至所述第二无线耳机,所述第二无线耳机将所述第一耳机传感器元数据作为所述第二耳机传感器元数据。
在一种可能的设计中,若所述第一无线耳机与所述第二无线耳机上均设置有耳机传感器,所述播放设备上未设置有播放设备传感器,则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据,包括:
所述第一无线耳机将所述第一耳机传感器元数据发送至所述第二无线耳机,所述第二无线耳机将所述第二耳机传感器元数据发送至所述第一无线耳机;
所述第一无线耳机与所述第二无线耳机分别根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;或者,
所述第一无线耳机将所述第一耳机传感器元数据发送至所述播放设备,所述第二无线耳机将所述第二耳机传感器元数据发送至所述播放设备,以使所述播放设备根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;
所述第一无线耳机与所述第二无线耳机分别接收所述渲染元数据。
在一种可能的设计中,若所述第一无线耳机上设置有耳机传感器,所述第二无线耳机上未设置有耳机传感器,所述播放设备上设置有播放设备传感器,则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据,包括:
所述第一无线耳机将所述第一耳机传感器元数据发送至所述播放设备,以使播放设备根据所述第一耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
所述第一无线耳机与所述第二无线耳机分别接收所述渲染元数据;或者,
所述第一无线耳机接收所述播放设备发送的播放设备传感器元数据;
所述第一无线耳机根据所述第一耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
所述第一无线耳机将所述渲染元数据发送至所述第二无线耳机。
在一种可能的设计中，若所述第一无线耳机与所述第二无线耳机上均设置有耳机传感器，所述播放设备上设置有播放设备传感器，则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据，包括：
所述第一无线耳机将所述第一耳机传感器元数据发送至所述播放设备,所述第二无线耳机将所述第二耳机传感器元数据发送至所述播放设备,以使所述播放设备根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
所述第一无线耳机与所述第二无线耳机分别接收所述渲染元数据;或者,
所述第一无线耳机将所述第一耳机传感器元数据发送至所述第二无线耳机,所述第二无线耳机将所述第二耳机传感器元数据发送至所述第一无线耳机;
所述第一无线耳机与所述第二无线耳机分别接收所述播放设备传感器元数据;
所述第一无线耳机以及所述第二无线耳机分别根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据。
可选的,所述待呈现音频信号包括基于声道的音频信号、基于对象的音频信号、基于场景的音频信号中的至少一种。
可选的，所述渲染处理包括：双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
可选的,所述无线传输方式包括:蓝牙通信、红外线通信、WIFI通信、LIFI可见光通信。
第二方面,本申请提供另一种音频处理方法,应用于播放设备,包括:
获取原始音频信号,并根据所述原始音频信号生成待呈现音频信号,所述待呈现音频信号包括第一音频信号和/或第二音频信号,其中,所述第一音频信号为在所述播放设备渲染处理后的音频信号,所述第二音频信号为待渲染的音频信号;
通过无线传输方式向无线耳机发送所述待呈现音频信号。
在一种可能的设计中,在所述通过无线传输方式向无线耳机发送的待呈现音频信号之前,包括:
通过所述无线传输方式接收所述无线耳机发送的指示信号,所述指示信号用于指示所述播放设备对所述原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
在一种可能的设计中,在所述通过无线传输方式向无线耳机发送的待呈现音频信号之前,还包括:
通过所述无线传输方式接收所述无线耳机的性能参数,并根据所述性能参数确定指示信号,所述指示信号用于指示所述播放设备对所述原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
在一种可能的设计中,所述通过所述无线传输方式接收所述无线耳机的性能参数,并根据所述性能参数确定所述指示信号,包括:
获取所述原始音频信号的特性参数,所述特性参数包括:码流格式、声道参数、对象参数以及场景成分参数中的至少一种;
根据所述特性参数以及所述性能参数确定所述指示信号。
在一种可能的设计中,所述指示信号包括标识码;
其中,若所述标识码为第一字段,则所述播放设备未对所述原始音频信号进行渲染,则所述待呈现音频信号包括所述第二音频信号,未包括所述第一音频信号,所述无线耳机对所述原始音频信号进行全部渲染;
若所述标识码为第二字段,则所述播放设备对所述原始音频信号进行全部渲染,则所述待呈现音频信号包括所述第一音频信号,未包括所述第二音频信号,所述无线耳机未对所述原始音频信号进行渲染;
若所述标识码为第三字段，则所述播放设备对所述原始音频信号进行部分渲染，则所述待呈现音频信号包括所述第一音频信号和所述第二音频信号，所述无线耳机对所述原始音频信号剩余部分进行渲染。
可选的,所述原始音频信号包括第四音频信号和/或第五音频信号,其中,所述第四音频信号用于处理后生成所述第一音频信号,所述第五音频信号用于生成所述第二音频信号;
对应的,在所述获取原始音频信号之后,还包括:
对所述第四音频信号进行解码处理,以获得第六音频信号,所述第六音频信号包括第七音频信号和/或第八音频信号;
对所述第七音频信号进行渲染处理,以获取第九音频信号;
对所述第八音频信号以及所述第九音频信号进行编码,以获取第十音频信号,所述待呈现音频信号包括所述第五音频信号以及所述第十音频信号。
在一种可能的设计中,所述对所述第七音频信号进行渲染处理,包括:
根据渲染元数据对所述第七音频信号进行渲染处理，以获得所述第九音频信号，其中，所述渲染元数据包括第一元数据以及第二元数据，所述第一元数据为所述播放设备端的元数据，所述第二元数据为无线耳机端的元数据。
在一种可能的设计中，所述第一元数据包括播放设备传感器元数据，其中，所述播放设备传感器元数据用于表征所述播放设备的运动特征；和/或，
所述第二元数据包括耳机传感器元数据以及头相关变换函数HRTF数据库，其中，所述耳机传感器元数据用于表征所述无线耳机的运动特征。
在一种可能的设计中,所述耳机传感器元数据通过耳机传感器获得,所述耳机传感器包括陀螺仪传感器、头部大小传感器、测距传感器、地磁传感器以及加速度传感器中的至少一种;和/或,
所述播放设备传感器元数据通过播放设备传感器获得,所述播放设备传感器包括陀螺仪传感器、头部大小传感器、测距传感器、地磁传感器以及加速度传感器中的至少一种。
可选的,所述待呈现音频信号包括基于声道的音频信号、基于对象的音频信号、基于场景的音频信号中的至少一种。
可选的,所述渲染处理包括:双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
可选的,所述无线传输方式包括:蓝牙通信、红外线通信、WIFI通信、LIFI可见光通信。
第三方面,本申请提供一种音频处理装置,包括:
获取模块,用于通过无线传输方式接收播放设备发送的待呈现音频信号,所述待呈现音频信号包括第一音频信号和/或第二音频信号,其中,所述第一音频信号为在所述播放设备渲染处理后的音频信号,所述第二音频信号为待渲染的音频信号;
渲染模块,用于在所述待呈现音频信号包括所述第二音频信号时,则对所述第二音频信号进行渲染处理,以获得第三音频信号;
播放模块,用于根据所述第一音频信号和/或所述第三音频信号进行后续音频播放。
在一种可能的设计中，所述获取模块，用于通过无线传输方式接收播放设备发送的待呈现音频信号之前，还包括：
发送模块，用于通过无线传输方式向所述播放设备发送指示信号，所述指示信号用于指示所述播放设备对原始音频信号按照对应的预设处理方式进行渲染，以获取所述待呈现音频信号。
在一种可能的设计中,所述发送模块,用于通过无线传输方式向所述播放设备发送指示信号之前,还包括:
所述获取模块,还用于获取所述无线耳机的性能参数,并根据所述性能参数确定所述指示信号。
在一种可能的设计中,所述发送模块,用于通过无线传输方式向所述播放设备发送指示信号之前,还包括:
所述获取模块,还用于接收所述播放设备发送的音频特性信息,所述音频特性信息包括输入至所述播放设备的所述原始音频信号的特性参数,所述特性参数包括:码流格式、声道参数、对象参数以及场景成分参数中的至少一种。
在一种可能的设计中,所述指示信号包括标识码;
其中,若所述标识码为第一字段,则所述播放设备未对所述原始音频信号进行渲染,所述待呈现音频信号包括所述第二音频信号,未包括所述第一音频信号,所述音频处理装置对所述原始音频信号进行全部渲染;
若所述标识码为第二字段,则所述播放设备对所述原始音频信号进行全部渲染,所述待呈现音频信号包括所述第一音频信号,未包括所述第二音频信号,所述音频处理装置未对所述原始音频信号进行渲染;
若所述标识码为第三字段，则所述播放设备对所述原始音频信号进行部分渲染，所述待呈现音频信号包括所述第一音频信号和所述第二音频信号，所述音频处理装置对所述原始音频信号剩余部分进行渲染。
在一种可能的设计中，在所述获取模块，用于通过无线传输方式接收播放设备发送的待呈现音频信号之后，还包括：
解码模块,用于对所述待呈现音频信号进行解码处理,以获得所述第一音频信号和/或所述第二音频信号。
在一种可能的设计中,所述渲染模块,用于对所述第二音频信号进行渲染处理,以获得第三音频信号,包括:
所述渲染模块,用于根据渲染元数据对所述第二音频信号进行渲染处理,以获得所述第三音频信号,其中,所述渲染元数据包括第一元数据以及第二元数据,所述第一元数据为所述播放设备端的元数据,所述第二元数据为无线耳机端的元数据。
在一种可能的设计中,所述第一元数据包括第一传感模块元数据,其中,所述第一传感模块元数据用于表征所述播放设备的运动特征;和/或,
所述第二元数据包括第二传感模块元数据以及头相关变换函数HRTF数据库,其中,所述第二传感模块元数据用于表征所述无线耳机的运动特征。
在一种可能的设计中，所述第一传感模块元数据通过第一传感模块获得，所述第一传感模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种；和/或，
所述第二传感模块元数据通过第二传感模块获得，所述第二传感模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种。
在一种可能的设计中,所述音频处理装置包括第一音频处理装置以及第二音频处理装置;
所述第一音频处理装置或所述第二音频处理装置中设置有所述第二传感子模块;或者,
所述第一音频处理装置与所述第二音频处理装置中均设置有所述第二传感子模块,在所述第一音频处理装置的获取模块与所述第二音频处理装置的获取模块,用于获取到所述播放设备传感器元数据之后,还包括:
同步模块,用于对所述播放设备传感器元数据进行相互同步。
在一种可能的设计中,所述第一音频处理装置包括:
第一接收模块,用于接收播放设备发送的第一待呈现音频信号;
第一渲染模块,用于对所述第一待呈现音频信号进行渲染处理,以获取第一播放音频信号;
第一播放模块,用于播放所述第一播放音频信号;
所述第二音频处理装置包括:
第二接收模块,用于接收所述播放设备发送的第二待呈现音频信号;
第二渲染模块,用于对所述第二待呈现音频信号进行渲染处理,以获取第二播放音频信号;
第二播放模块,用于播放所述第二播放音频信号。
在一种可能的设计中,所述第一音频处理装置,还包括:
第一解码模块,用于对所述第一待呈现音频信号进行解码处理,以获取第一解码音频信号;
所述第一渲染模块,具体用于:根据所述第一解码音频信号以及渲染元数据进行渲染处理,以获取所述第一播放音频信号;
所述第二音频处理装置,还包括:
第二解码模块,用于对所述第二待呈现音频信号进行解码处理,以获取第二解码音频信号;
所述第二渲染模块,具体用于:根据所述第二解码音频信号以及渲染元数据进行渲染处理,以获取所述第二播放音频信号。
在一种可能的设计中,所述渲染元数据包括第一无线耳机元数据、第二无线耳机元数据以及播放设备元数据中的至少一种。
在一种可能的设计中,所述第一无线耳机元数据包括第一耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述第一耳机传感器元数据用于表征所述第一无线耳机的运动特征;
所述第二无线耳机元数据包括第二耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述第二耳机传感器元数据用于表征所述第二无线耳机的运动特征;
所述播放设备元数据包括播放设备传感器元数据,其中,所述播放设备传感器元数据用于表征所述播放设备的运动特征。
在一种可能的设计中,所述第一音频处理装置,还包括:
第一同步模块,用于与所述第二无线耳机同步所述渲染元数据;和/或,
所述第二音频处理装置,还包括:
第二同步模块,用于与所述第一无线耳机同步所述渲染元数据。
在一种可能的设计中，所述第一同步模块，具体用于：将所述第一耳机传感器元数据发送至所述第二无线耳机，以使所述第二同步模块将所述第一耳机传感器元数据作为所述第二耳机传感器元数据。
在一种可能的设计中,所述第一同步模块,具体用于:
发送所述第一耳机传感器元数据;
接收所述第二耳机传感器元数据;
根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;
所述第二同步模块,具体用于:
发送所述第二耳机传感器元数据;
接收所述第一耳机传感器元数据;
根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;或者,
所述第一同步模块,具体用于:
发送所述第一耳机传感器元数据;
接收所述渲染元数据;
所述第二同步模块,具体用于:
发送所述第二耳机传感器元数据;
接收所述渲染元数据。
在一种可能的设计中,所述第一同步模块,具体用于:
接收播放设备传感器元数据;
根据所述第一耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
发送所述渲染元数据。
在一种可能的设计中,所述第一同步模块,具体用于:
发送所述第一耳机传感器元数据;
接收所述第二耳机传感器元数据;
接收所述播放设备传感器元数据;
根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
所述第二同步模块,具体用于:
发送所述第二耳机传感器元数据;
接收所述第一耳机传感器元数据;
接收所述播放设备传感器元数据;
根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据。
可选的,所述待呈现音频信号包括基于声道的音频信号、基于对象的音频信号、基于场景的音频信号中的至少一种。
可选的,所述渲染处理包括:双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
可选的,所述无线传输方式包括:蓝牙通信、红外线通信、WIFI通信、 LIFI可见光通信。
第四方面,本申请提供的另一种音频处理装置,包括:
获取模块,用于接收原始音频信号,并根据所述原始音频信号生成待呈现音频信号,所述待呈现音频信号包括第一音频信号和/或第二音频信号,其中,所述第一音频信号为在播放设备渲染处理后的音频信号,所述第二音频信号为待渲染的音频信号;
发送模块,用于通过无线传输方式向无线耳机发送的待呈现音频信号。
在一种可能的设计中,在所述发送模块,用于通过无线传输方式向无线耳机发送的待呈现音频信号之前,包括:
所述获取模块,还用于通过所述无线传输方式接收所述无线耳机发送的指示信号,所述指示信号用于指示所述播放设备对所述原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
在一种可能的设计中,在所述发送模块,用于通过无线传输方式向无线耳机发送的待呈现音频信号之前,还包括:
所述获取模块,还用于通过所述无线传输方式接收所述无线耳机的性能参数,并根据所述性能参数确定指示信号,所述指示信号用于指示所述播放设备对所述原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
在一种可能的设计中,所述获取模块,还用于通过所述无线传输方式接收所述无线耳机的性能参数,并根据所述性能参数确定指示信号,包括:
所述获取模块,还用于获取所述原始音频信号的特性参数,所述特性参数包括:码流格式、声道参数、对象参数以及场景成分参数中的至少一种;
所述获取模块,还用于根据所述特性参数以及所述性能参数确定所述指示信号。
可选的,所述指示信号包括标识码;
其中,若所述标识码为第一字段,则所述播放设备未对所述原始音频信号进行渲染,则所述待呈现音频信号包括所述第二音频信号,未包括所述第一音频信号,所述音频处理装置对所述原始音频信号进行全部渲染;
若所述标识码为第二字段,则所述播放设备对所述原始音频信号进行全部渲染,则所述待呈现音频信号包括所述第一音频信号,未包括所述第二音频信号,所述音频处理装置未对所述原始音频信号进行渲染;
若所述标识码为第三字段，则所述播放设备对所述原始音频信号进行部分渲染，则所述待呈现音频信号包括所述第一音频信号和所述第二音频信号，所述音频处理装置对所述原始音频信号剩余部分进行渲染。
可选的,所述原始音频信号包括第四音频信号和/或第五音频信号,其中,所述第四音频信号用于处理后生成所述第一音频信号,所述第五音频信号用于生成所述第二音频信号;
对应的,在所述获取模块,用于获取原始音频信号之后,还包括:
解码模块,用于对所述第四音频信号进行解码处理,以获得第六音频信号,所述第六音频信号包括第七音频信号和/或第八音频信号;
渲染模块,用于对所述第七音频信号进行渲染处理,以获取第九音频信号;
编码模块,用于对所述第八音频信号以及所述第九音频信号进行编码,以获取第十音频信号,所述待呈现音频信号包括所述第五音频信号以及所述第十音频信号。
在一种可能的设计中,所述渲染模块,用于对所述第七音频信号进行渲染处理,包括:
所述渲染模块,用于根据渲染元数据对所述第七音频信号进行渲染处理,以获得所述第九音频信号,其中,所述渲染元数据包括第一元数据以及第二元数据,所述第一元数据为所述播放设备端的元数据,所述第二元数据为无线耳机端的元数据。
在一种可能的设计中,所述第一元数据包括第一传感子模块元数据,其中,所述第一传感子模块元数据用于表征所述播放设备的运动特征;和/或,
所述第二元数据包括第二传感子模块元数据以及头相关变换函数HRTF数据库，其中，所述第二传感子模块元数据用于表征所述无线耳机的运动特征。
在一种可能的设计中,所述第一传感子模块元数据通过第一传感子模块获得,所述第一传感子模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种;和/或,
所述第二传感子模块元数据通过第二传感子模块获得,所述第二传感子模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种。
可选的,所述待呈现音频信号包括基于声道的音频信号、基于对象的音频信号、基于场景的音频信号中的至少一种。
可选的,所述渲染处理包括:双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
可选的，所述无线传输方式包括：蓝牙通信、红外线通信、WIFI通信、LIFI可见光通信。
第五方面,本申请还提供一种无线耳机,包括:
处理器;以及
存储器,用于存储所述处理器的计算机程序;
其中,所述处理器被配置为通过执行所述计算机程序来实现上述第一方面中任意一项可能的音频处理方法。
第六方面,本申请还提供一种播放设备,包括:
处理器;以及
存储器,用于存储所述处理器的计算机程序;
其中,所述处理器被配置为通过执行所述计算机程序来实现上述第二方面中任意一项可能的音频处理方法。
第七方面，本申请还提供一种计算机可读存储介质，所述计算机可读存储介质中存储有计算机程序，所述计算机程序用于执行第一方面所提供的任意一种可能的音频处理方法。
第八方面，本申请还提供一种计算机可读存储介质，所述计算机可读存储介质中存储有计算机程序，所述计算机程序用于执行第二方面所提供的任意一种可能的音频处理方法。
第九方面,本申请还提供一种系统,包括第五方面的无线耳机和第六方面的播放设备。
本申请提供一种音频处理方法、装置、系统以及存储介质,首先无线耳机端通过无线传输方式接收播放设备发送的待呈现音频信号,待呈现音频信号包括播放设备渲染处理后的音频信号即第一音频信号以及待渲染的音频信号即第二音频信号;然后若待呈现音频信号包括第二音频信号,则无线耳机端对第二音频信号进行渲染处理,以获得第三音频信号;最后无线耳机端根据第一音频信号和/或第三音频信号进行后续音频播放。从而实现无线耳机能够呈现高品质环绕声和全景声效果的技术效果。
附图说明
为了更清楚地说明本申请实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作一简单地介绍,显而易见地,下面描述中的附图是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1为本申请根据一示例性实施例示出的一种无线耳机的结构示意图;
图2为本申请根据一示例性实施例示出的一种音频处理方法的应用场景示意图;
图3为本申请根据一示例性实施例示出的音频处理方法的流程示意图;
图4为本申请实施例提供的一种音频数据渲染模块所含渲染方式示意图;
图5为本申请实施例提供的一种HRTF渲染方法的流程示意图;
图6为本申请实施例提供的另一种HRTF渲染方法的流程示意图;
图7为本申请实施例提供的无线耳机端进行音频信号渲染的数据流示意图;
图8为本申请实施例提供的另一种音频处理方法的流程示意图;
图9为本申请实施例提供的音频处理信号在播放设备和无线耳机中的数据链路示意图;
图10为本申请实施例提供的又一种音频处理方法的流程示意图;
图11为本申请实施例提供的TWS真无线耳机关于声道信息的渲染过程示意图;
图12为本申请实施例提供的一种音频处理装置的结构示意图;
图13为本申请实施例提供的另一种音频处理装置的结构示意图;
图14为本申请提供的一种无线耳机的结构示意图;
图15为本申请提供的另一种播放设备的结构示意图。
通过上述附图,已示出本申请明确的实施例,后文中将有更详细的描述。这些附图和文字描述并不是为了通过任何方式限制本申请构思的范围,而是通过参考特定实施例为本领域技术人员说明本申请的概念。
具体实施方式
为使本申请实施例的目的、技术方案和优点更加清楚,下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,包括但不限于对多个实施例的组合,都属于本申请保护的范围。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”、“第三”、“第四”等(如果存在)是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的本申请的实施例例如能够以除了在这里图示或描述的那些以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
下面以具体地实施例对本申请的技术方案以及本申请的技术方案如何解决上述技术问题进行详细说明。下面这几个具体的实施例可以相互结合,对于相同或相似的概念或过程可能在某些实施例中不再赘述。下面将结合附图,对本申请的实施例进行描述。
图1为本申请根据一示例性实施例示出的一种无线耳机的结构示意图，图2为本申请根据一示例性实施例示出的一种音频处理方法的应用场景示意图。如图1-图2所示，本实施例提供的音频处理方法应用于无线耳机10，其中，该无线耳机10包括第一无线耳机101以及第二无线耳机102，并且第一无线耳机101与第二无线耳机102之间通过第一无线链路103进行通信连接。值得说明地，第一无线耳机101和第二无线耳机102之间的通信连接可以为双向，也可以为单向，在本实施例中不做具体限定。此外，值得理解地，上述的无线耳机10与播放设备20可以是根据标准无线协议进行通信的无线收发设备，其中，该标准无线协议可以为蓝牙协议、Wifi协议、Lifi协议、红外线无线传输协议等等，在本实施例中，并不对其无线协议的具体形式进行限定。为了能够对本实施例提供的音频处理方法的应用场景进行具体的说明，可以以标准无线协议为蓝牙协议进行举例说明，此处，无线耳机10则可以为TWS（True Wireless Stereo）真无线耳机，或者是传统蓝牙耳机等。
图3为本申请根据一示例性实施例示出的音频处理方法的流程示意图。如图3所示,本实施例提供的音频处理方法,包括:
S301、获取原始音频信号,根据原始音频信号生成待呈现音频信号。
在本步骤中,播放设备获取原始音频信号,并对原始音频信号进行预处理,可以包括解码、渲染、再编码等至少一个预处理程序。
可选的,当播放设备获取到原始音频信号后,就可以对全部或者部分原始音频信号进行解码,得到音频内容数据和音频特性信息,所述音频内容数据可以包含但不限于声道内容音频信号;所述音频特性信息可以包含但不限于声场类型,采样率,比特率信息等。
原始音频信号包括基于声道的音频信号,例如AAC/AC3码流等、基于对象的音频信号,例如ATMOS/MPEG-H码流等,基于场景的音频信号,例如MPEG-H HOA码流,或上述3种任意组合的音频信号,例如WANOS码流。
当原始音频信号是基于声道的音频信号时,例如AAC/AC3码流等,对音频码流全解码,得到各声道的音频内容信号,以及声道特性信息例如:声场类型,采样率,比特率等。
当原始音频信号是基于对象的音频信号时,例如ATMOS/MPEG-H码流等,只对音频音床进行解码,得到各声道的音频内容信号,以及声道特性信息,例如声场类型,采样率,比特率等。
当原始音频信号是基于场景的音频信号时,例如MPEG-H HOA码流,对音频码流全解码,得到各声道的音频内容信号,以及声道特性信息,例如声场类型,采样率,比特率等。
当原始音频信号是基于上述三种信号的码流时,例如WANOS码流,对音频码流按对上述三种信号的码流解码描述进行解码,得到各声道的音频内容信号,以及声道特性信息,例如声场类型,采样率,比特率等。
可选的，播放设备可以对解码后的音频内容数据进行渲染处理，得到渲染后的音频信号和元数据。所述音频内容可以包含但不限于声道的音频内容信号和对象的音频内容信号；所述元数据可以包含但不限于声道特性信息，如声场类型、采样率、比特率等，和对象的三维空间信息，以及无线耳机的渲染元数据，例如可以包含但不限于传感器元数据和HRTF（Head Related Transfer Function，头相关变换函数）数据库。
图4为本申请实施例提供的一种音频数据渲染模块所含渲染方式示意图。在本实施例中,所述渲染方式,如图4所示,包含但不限于以下渲染方式的任意组合:HRTF渲染、声道渲染、对象渲染、场景渲染等。
图5为本申请实施例提供的一种HRTF渲染方法的流程示意图。如图5所示,当解码后的音频信号是声道信号时,该渲染方法的具体步骤包括:
S501、获取基于声道的音频信号和基本元数据。
在本步骤中,声道的音频信号为声道的内容信号,其包括声道数,基本元数据是声道的基本信息,包含声场类型,采样率等信息。
S502、基于基本元数据构建各声道的空间位置分布(X1,Y1,Z1)。
在本步骤中,利用基本元数据,根据预设算法来构建各声道的空间分布。
S503、接收到渲染元数据后,对各个声道空间分布进行旋转变换,得到新坐标系下的空间分布(X2,Y2,Z2),并换算成以人头为中心的空间极坐标(ρ1,α1,β1)。
在本步骤中,接收渲染元数据中来自传感器的传感器元数据,对各个声道空间分布进行旋转变换。具体坐标换算方法,根据一般笛卡尔坐标系和极坐标系的转换方法进行计算即可,此处不再赘述。
S504、基于极坐标,从HRTF数据库中选出对应角度的滤波器系数HRTF(i),对基于声道的音频信号进行滤波,得到滤波后的音频数据。
在本步骤中,根据极坐标(ρ1,α1,β1)中,距离与角度信息,从HRTF数据库数据中选出对应的滤波器数组HRTF(i),然后对各声道的音频信号进行滤波。
S505、对滤波后的音频信号进行下混处理,得到HRTF虚拟后的双耳信号。
在本步骤中,对于滤波后的音频信号进行下混处理,就能够得到左右两个无线耳机的音频信号即双耳信号。
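上述S501-S505的处理流程可以用如下示意性代码概括。该代码仅为一个极简草图：HRTF数据库以"方位角到左右耳增益"的字典表示，滤波以简单增益代替真实的HRTF卷积，旋转变换仅示意水平偏航方向，上述结构与参数均为本文之外的假设，并非对实际实现的限定。

```python
import math

def cart_to_polar(x, y, z):
    # S503：笛卡尔坐标换算为以人头为中心的空间极坐标 (rho, alpha, beta)
    rho = math.sqrt(x * x + y * y + z * z)
    alpha = math.degrees(math.atan2(y, x))                    # 水平方位角
    beta = math.degrees(math.asin(z / rho)) if rho else 0.0   # 垂直仰角
    return rho, alpha, beta

def rotate_yaw(pos, yaw_deg):
    # S503：根据传感器元数据中的偏航角对声道空间位置做旋转变换（仅示意水平旋转）
    t = math.radians(yaw_deg)
    x, y, z = pos
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t),
            z)

def render_channel(samples, pos, yaw_deg, hrtf_db):
    # S502-S505：构建空间位置 -> 旋转 -> 极坐标 -> 选取对应角度的HRTF -> 滤波 -> 双耳信号
    rho, alpha, beta = cart_to_polar(*rotate_yaw(pos, yaw_deg))
    # 从假设的HRTF数据库中选出与方位角最接近的左右耳系数（此处以增益对代替滤波器数组）
    key = min(hrtf_db, key=lambda a: abs(a - alpha))
    gl, gr = hrtf_db[key]
    left = [s * gl for s in samples]
    right = [s * gr for s in samples]
    return left, right
```

例如，正前方声道在听者头部向左转动90度后，会被渲染到听者右侧对应的HRTF上。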
需要说明的是，所述传感器元数据可以由陀螺仪传感器、地磁装置、加速度计等的组合方式提供；所述HRTF数据库，可以基于但不限于无线耳机上的其它传感器元数据，例如头部大小传感器，或者基于具有摄像或拍照功能的前端设备进行人头部智能识别后，根据听者头部、耳部等身体特性进行个性化处理和调整，达到个性化效果；所述HRTF数据库，可以提前存入无线耳机，也可以后续通过有线或者无线的方式，把新的HRTF数据库导入其中，进行HRTF数据库的更新，达到上述个性化的目的。
还需要说明的是,由于HRTF数据库可能精度有限,在进行计算时,可以考虑采用插值的方式,获取对应角度的HRTF数据组;另外,在S505后续可以进一步添加后续处理步骤,包含但不限于均衡(EQ)、延迟、混响等处理。
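对于上述插值方式，可以参考如下示意性代码：当目标方位角落在数据库两个相邻角度之间时，对两组系数做线性插值（数据库结构与插值方法均为示例性假设，实际实现也可以采用更精细的插值）。

```python
def interp_hrtf(hrtf_db, alpha):
    # 当HRTF数据库角度精度有限时，对相邻两个角度的系数做线性插值（示意）
    angles = sorted(hrtf_db)
    lo = max([a for a in angles if a <= alpha], default=angles[0])
    hi = min([a for a in angles if a >= alpha], default=angles[-1])
    if lo == hi:
        return hrtf_db[lo]
    w = (alpha - lo) / (hi - lo)  # 目标角度在两个已知角度间的权重
    return tuple((1 - w) * a + w * b for a, b in zip(hrtf_db[lo], hrtf_db[hi]))
```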
图6为本申请实施例提供的另一种HRTF渲染方法的流程示意图。如图6所示,当解码后的音频信号是对象信号时,该渲染方法的具体步骤包括:
S601、获取基于对象的音频信号和对象的空间坐标(X3,Y3,Z3)。
S602、接收到渲染元数据后,对各个声道空间分布进行旋转变换,得到新坐标系下的空间分布(X4,Y4,Z4),并换算成以人头为中心的空间极坐标(ρ2,α2,β2)。
S603、基于极坐标,从HRTF数据库中选出对应角度的滤波器系数HRTF(k),对基于对象的音频信号进行滤波,得到滤波后的音频数据。
S604、对滤波后的音频数据进行下混处理,得到HRTF虚拟后的双耳信号。
S601-S604的步骤和名词概念与S501-S505类似,可以对比理解,在此不再赘述。
对于图4所示的声道渲染,播放设备可以对全部或者部分声道音频信号进行渲染处理,其处理方式包括但不限于声道数量的下混(如7.1下混至5.1)、声道维度的下混(如5.1.4下混至5.1)等。
对于图4所示的对象渲染,播放设备可对输入的全部或部分对象音频信号进行渲染处理,根据对象的元数据将对象音频内容渲染至指定位置和指定数量的声道上,使其变为声道音频信号。
对于图4所示的场景渲染,播放设备可对输入的全部或部分场景音频信号进行渲染处理,根据指定的输入声道数和输出声道数,将场景音频信号渲染至指定输出声道上,使其变为声道音频信号。
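以图4中的声道渲染里提到的声道数量下混为例，可以用如下示意性代码说明（此处以5.1下混至立体声为例，0.707的中置/环绕下混系数为常用近似值，并非对标准或具体实现的限定）：

```python
def downmix_51_to_stereo(ch):
    # 声道数量下混的简单示例：5.1（L, R, C, LFE, Ls, Rs）下混为立体声
    L, R, C, LFE, Ls, Rs = ch
    left = L + 0.707 * C + 0.707 * Ls    # 中置与左环绕按约-3dB混入左声道
    right = R + 0.707 * C + 0.707 * Rs   # 中置与右环绕按约-3dB混入右声道
    return left, right
```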
进一步可选的,播放设备可以对渲染后的音频数据及渲染后的元数据进行再编码,输出编码后的音频码流作为待呈现音频信号通过无线传输给无线耳机。
S302、播放设备通过无线传输方式向无线耳机发送待呈现音频信号。
在本步骤中,所述待呈现音频信号包括第一音频信号和/或第二音频信号,其中,所述第一音频信号为在所述播放设备渲染处理后的音频信号,所述第二音频信号为待渲染的音频信号。
需要说明的是,第一音频信号是已经在播放设备中已经完成了渲染处理的音频信号,而第二音频信号是播放设备没有进行渲染处理的信号,需要耳机进行进一步的渲染处理。
具体的,在一种可能的设计中,若待呈现音频信号中只包含第一音频信号,则无线耳机直接将第一音频信号进行播放。因为在一些高品质的音源数据,比如无损音乐,其本身已经具备了较高的音质或者是已经包含了相应的渲染效果,就无需耳机进行进一步的渲染处理。进一步的,在某些应用场景中,用户使用无线耳机时也很少进行剧烈的头部运动,对渲染的需求不高,则无需无线耳机进行渲染。
在一种可能的设计中，若待呈现音频信号中包含有第二音频信号，则无线耳机需要执行S303对第二音频信号进行渲染。
需要说明的是,渲染处理的目的是使得声音能够呈现出立体环绕声的效果以及全景声效果,增加声音的空间感,以及模拟出人对声音获得声音方位感的效果,如能够辨别一辆车的来向或者去向,以及这辆车是高速接近还是远离等效果。
进一步的，在一种可能的设计中，无线耳机通过无线传输方式接收到播放设备发送的待呈现音频信号，当待呈现音频信号为压缩码流时，无线耳机通过对待呈现音频信号进行解码处理，以获得第一音频信号，和/或，第二音频信号。即待呈现音频信号需要经过解码处理才能得到第一音频信号，和/或，第二音频信号。
需要说明的是,解码后的第一音频信号或者第二音频信号包含音频内容数据和音频特性信息,所述音频内容数据可以包含但不限于声道内容音频信号;所述音频特性信息可以包含但不限于声场类型,采样率,比特率信息等。
还需要说明的是,所述无线传输方式包括:蓝牙通信、红外线通信、WIFI通信、LIFI可见光通信。本领域技术人员可以根据实际情况选择具体的无线传输方式,不限于上述的情况,或者选择几种无线传输方式相互组合以达到播放设备与无线耳机进行信息交互的作用。
S303、若待呈现音频信号包括第二音频信号，则对第二音频信号进行渲染处理，以获得第三音频信号。
在本步骤中,待呈现音频信号包括第二音频信号是指待呈现音频信号中只包含了第二音频信号,或者是待呈现音频信号中既存在第一音频信号,又存在第二音频信号。
图7为本申请实施例提供的无线耳机端进行音频信号渲染的数据流示意图。如图7所示,待呈现音频信号71至少包含第一音频信号721与第二音频信号722中的一个,而第二音频信号722必须经过无线耳机的渲染才能够作为后续播放音频74或者是后续播放音频74的一部分进行播放。
需要说明的是,本实施例中播放设备以及无线耳机的所述渲染处理包括:双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
当无线耳机为传统无线蓝牙耳机时,即两只耳机通过有线相连接,且共用相关的传感器、处理单元等。此时其渲染方式如下:
所述第二音频信号包含音频内容数据和音频特性信息,对音频内容进行渲染,得到渲染后的音频信号和元数据。所述音频内容可以包含但不限于声道的音频内容信号和对象的音频内容信号;所述元数据可以包含但不限于声道特性信息,如声场类型、采样率、比特率等,以及对象的三维空间信息,以及无线耳机端的渲染元数据,例如可以包含但不限于传感器元数据和HRTF数据库。
具体的渲染过程与播放设备的渲染原理相同,可以参考图5和图6所示的HRTF渲染,以及S302中介绍的播放设备的其它渲染方法。
可选的,所述对所述第二音频信号进行渲染处理,以获得第三音频信号,包括:
根据渲染元数据对所述第二音频信号进行渲染处理,以获得所述第三音频信号,其中,所述渲染元数据包括第一元数据以及第二元数据,所述第一元数据为所述播放设备端的元数据,所述第二元数据为无线耳机端的元数据。
所谓元数据是描述数据属性的信息，第一元数据用来表示播放设备目前所处的运动状态，播放设备的信号传输强度，信号传播方向，播放设备与无线耳机的距离或相对运动状态等；第二元数据用来表示无线耳机的运动状态，比如人的头部在摆动或者晃动，就会引起无线耳机也跟随着运动，并且第二元数据还可以包含左右两个无线耳机的相对运动距离、相对运动速度以及加速度等信息。第一元数据以及第二元数据共同为实现高质量的环绕声或者全景声音效提供了渲染依据。例如，用户在使用虚拟现实设备进行第一人称射击类游戏时，左右扭头观察的同时还需要倾听是否有敌人靠近，或者是通过附近的枪战声来判定敌人的位置，为了更真实地渲染出此时的环境音，就需要结合无线耳机的第二元数据以及佩戴在用户身上的播放设备或者是放置在房间中的播放设备的第一元数据，提供给无线耳机和/或播放设备，综合起来渲染原始音频数据，以达到逼真、高质量的声音播放效果。
在一个可能的实现方式中,所述第一元数据包括第一传感器元数据,其中,所述第一传感器元数据用于表征所述播放设备的运动特征;和/或,
所述第二元数据包括第二传感器元数据以及头相关变换函数HRTF数据库,其中,所述第二传感器元数据用于表征所述无线耳机的运动特征。
具体的,第一元数据可以是由第一传感器检测到的,第一传感器可以位于播放设备上,也可以位于无线耳机上,或者是用户身上佩戴的其它物体,如智能手环或智能手表上。如图5所示,在播放设备的音频信号渲染阶段,第一元数据即为图5中的传感器元数据,在无线耳机的音频信号渲染阶段,第二传感器元数据即为图5中的传感器元数据,头相干变换函数HRTF数据库即为图5中的HRTF数据库数据。即第一元数据用于播放设备的渲染,第二元数据用于无线耳机的渲染。
可选的,所述第一传感器元数据通过第一传感器获得,所述第一传感器包括陀螺仪传感器、头部大小传感器、测距传感器、地磁传感器以及加速度传感器中的至少一种;和/或,
所述第二传感器元数据通过第二传感器获得,所述第二传感器包括陀螺仪传感器、头部大小传感器、测距传感器、地磁传感器以及加速度传感器中的至少一种。
在一种可能的设计中,所述无线耳机包括第一无线耳机以及第二无线耳机;
所述第一无线耳机或所述第二无线耳机中设置有所述第二传感器;或者,
所述第一无线耳机与所述第二无线耳机中均设置有所述第二传感器,则在所述第一无线耳机与所述第二无线耳机分别获取到所述第二传感器元数据之后,对所述第二传感器元数据进行相互同步。
S304、根据第一音频信号和/或第三音频信号进行后续音频播放。
在本步骤中,无线耳机将第一音频信号和/或第三音频信号进行音频播放,具体为,当仅包含第一音频信号的时候,即播放设备传输的待呈现音频信号无需在无线耳机中进行渲染的部分,直接将其进行播放;当仅包含第三音频信号的时候即播放设备传输的待呈现音频信号都需要在无线耳机中进行渲染,以得到第三音频信号,然后由无线耳机进行播放;当同时包含第一音频信号和第三音频信号的时候,无线耳机需要根据预设的组合算法,将两者进行组合,然后再对组合后的音频信号进行播放。在本申请中,不对组合算法进行限定,本领域技术人员可以根据具体的应用场景选择合适的组合算法实现方式。
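对于同时包含第一音频信号和第三音频信号的情形，一个可能的组合算法草图如下（对齐长度后逐样本叠加并限幅仅为示例性假设，如上文所述，实际组合算法由具体实现选择）：

```python
def combine_playback(first, third):
    # 预设组合算法的示意：补齐较短信号后逐样本叠加，并限幅防止削波
    n = max(len(first), len(third))
    first = first + [0.0] * (n - len(first))
    third = third + [0.0] * (n - len(third))
    return [max(-1.0, min(1.0, a + b)) for a, b in zip(first, third)]
```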
本实施例提供一种音频处理方法,首先无线耳机端通过无线传输方式接收播放设备发送的待呈现音频信号,待呈现音频信号包括播放设备渲染处理后的音频信号即第一音频信号以及待渲染的音频信号即第二音频信号;然后若待呈现音频信号包括第二音频信号,则无线耳机端对第二音频信号进行渲染处理,以获得第三音频信号;最后无线耳机端根据第一音频信号和/或第三音频信号进行后续音频播放。从而实现无线耳机能够呈现高品质环绕声和全景声效果的技术效果。
图8为本申请实施例提供的另一种音频处理方法的流程示意图。如图8所示,该方法的具体步骤包括:
S801、获取原始音频信号。
在本步骤中,播放设备从内部存储器、数据库、因特网等资源库中获取原始音频信号。
S802、无线耳机通过无线传输方式向播放设备发送指示信号。
在本步骤中,所述指示信号用于指示所述播放设备对原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。指示信号的作用是用来表明无线耳机的渲染处理能力,例如当无线耳机自身所带电量较为充裕时,其处理能力较强,则在无线耳机与播放设备握手阶段,即建立无线连接阶段,向播放设备指示可以分配较高比例的渲染任务给无线耳机;当无线耳机自身所携带电量较少时,其处理能力较弱,或者说为了维持无线耳机能够更长时间工作,即处于省电模式,此时无线耳机向播放设备指示分配较低比例的渲染任务,或者是不分配渲染任务给无线耳机。
在一种可能的设计中,无线耳机通过无线传输方式发送无线耳机的性能 参数,播放设备接收到无线耳机的性能参数后,通过查询性能参数与指示信号的映射关系表得到指示信号,或者是利用预设算法,根据性能参数,计算出指示信号。
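作为一个示意，无线耳机根据性能参数确定指示信号的过程可以草拟如下（以电量和省电模式为性能参数，电量阈值与字段名称均为假设值，实际映射关系或预设算法由具体实现决定）：

```python
def choose_flag(battery_pct, power_save):
    # 根据耳机性能参数确定指示信号的标识码（阈值与字段名为示例性假设）
    if power_save or battery_pct < 20:
        return "FIELD_2"   # 处理能力弱：耳机不渲染，播放设备全部渲染
    if battery_pct > 60:
        return "FIELD_1"   # 处理能力充裕：全部渲染交给耳机
    return "FIELD_3"       # 介于两者之间：部分渲染
```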
S803、根据指示信号对原始音频信号按照对应的预设处理方式进行渲染,以获取待呈现音频信号。
在一种可能的设计中,所述指示信号包括标识码;
其中,若所述标识码为第一字段,则所述播放设备未对所述原始音频信号进行渲染,所述待呈现音频信号包括所述第二音频信号,未包括所述第一音频信号,所述无线耳机对所述原始音频信号进行全部渲染;
若所述标识码为第二字段,则所述播放设备对所述原始音频信号进行全部渲染,所述待呈现音频信号包括所述第一音频信号,未包括所述第二音频信号,所述无线耳机未对所述原始音频信号进行渲染;
若所述标识码为第三字段,则所述播放设备对所述原始音频信号进行部分渲染,所述包括所述第一音频信号和所述第二音频信号,所述无线耳机对所述原始音频信号剩余部分进行渲染。
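上述三种标识码对应的渲染分工可以用如下示意性代码表达（字段取值、信号的列表表示以及"对半拆分"的部分渲染方式均为示例性假设，仅用于说明分工逻辑）：

```python
def split_render_tasks(flag, original):
    # 根据指示信号的标识码划分播放设备与无线耳机各自渲染的部分
    if flag == "FIELD_1":        # 第一字段：播放设备不渲染，耳机全部渲染
        return {"device": None, "headset": original}
    if flag == "FIELD_2":        # 第二字段：播放设备全部渲染，耳机不渲染
        return {"device": original, "headset": None}
    if flag == "FIELD_3":        # 第三字段：部分渲染，此处示意为对半拆分
        mid = len(original) // 2
        return {"device": original[:mid], "headset": original[mid:]}
    raise ValueError("unknown flag")
```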
指示信息可以在无线耳机与播放设备第一次连接时就从无线耳机发送到播放设备,这样后续就无需消耗播放设备或者无线耳机的处理资源。
可以理解的是,指示信息也可以按周期触发传送,以便于根据播放内容的不同而进行变更,使得无线耳机的音质能够进行动态调节。
指示信息还可以根据无线耳机中的传感器接收到的用户指令来触发传送。
为了便于说明指示信号的作用,下面结合图9来进行说明。
图9为本申请实施例提供的音频处理信号在播放设备和无线耳机中的数据链路示意图。如图9所示,从播放设备获得原始音频信号S0开始,到播放设备输出待呈现信号S3,指示信号的作用就是对原始音频信号S0的数据流向进行指引。
原始音频信号S0包括第四音频信号S01和/或第五音频信号S02,其中,所述第四音频信号S01用于处理后生成所述第一音频信号S40,所述第五音频信号S02用于生成所述第二音频信号S41;
播放设备在所述获取原始音频信号S0之后,对所述第四音频信号S01进行解码处理,以获得第六音频信号S1,所述第六音频信号S1包括第七音频信号S11和/或第八音频信号S12;
对所述第七音频信号S11进行渲染处理,以获取第九音频信号S2;
对所述第八音频信号S12以及所述第九音频信号S2进行编码，以获取第十音频信号S30，所述待呈现音频信号包括所述第五音频信号S02以及所述第十音频信号S30；
其中,对所述第七音频信号S11进行渲染处理,包括:
根据渲染元数据对所述第七音频信号S11进行渲染处理,以获得所述第九音频信号S2,其中,所述渲染元数据包括第一元数据D3以及第二元数据D5,所述第一元数据D3为所述播放设备端的元数据,所述第二元数据D5为无线耳机端的元数据。
在图9所示的音频信号传输链路中,可以存在多条从原始音频信号至后续播放音频的数据链路,也可以只存在一条数据链路。指示信号和/或原始音频信号决定数据链路的具体使用情况。
S804、播放设备通过无线传输方式向无线耳机发送待呈现音频信号。
S805、若待呈现音频信号包括第二音频信号,则对第二音频信号进行渲染处理,以获得第三音频信号。
S806、根据第一音频信号和/或第三音频信号进行后续音频播放
在本实施例中,步骤S804-S805与图3所示的音频处理方法的S302-S304类似,在此不再赘述。
本实施例提供一种音频处理方法,首先无线耳机端通过无线传输方式接收播放设备发送的待呈现音频信号,待呈现音频信号包括播放设备渲染处理后的音频信号即第一音频信号以及待渲染的音频信号即第二音频信号;然后若待呈现音频信号包括第二音频信号,则无线耳机端对第二音频信号进行渲染处理,以获得第三音频信号;最后无线耳机端根据第一音频信号和/或第三音频信号进行后续音频播放。从而实现无线耳机能够呈现高品质环绕声和全景声效果的技术效果。
图10为本申请实施例提供的又一种音频处理方法的流程示意图。如图10所示,该方法的具体步骤包括:
S1001、获取原始音频信号,根据原始音频信号生成待呈现音频信号。
在本步骤中,播放设备获取原始音频信号,原始音频信号可以包括无损音乐、游戏音频、电影音频等。然后,播放设备对原始音频信号进行解码、渲染以及重编码中至少一个。本步骤S1001可能的实现方式,请参见S803中 关于图9中播放设备部分的数据链路分布情况的描述,在此不再赘述。
S10021、第一无线耳机接收播放设备发送的第一待呈现音频信号。
S10022、第二无线耳机接收播放设备发送的第二待呈现音频信号。
在本实施例中,无线耳机包括第一无线耳机以及第二无线耳机,其中,所述第一无线耳机与所述第二无线耳机用于与播放设备建立无线连接。
需要说明的是,S10021和S10022可以是同时发生的,不限定先后次序。
S10031、第一无线耳机对第一待呈现音频信号进行渲染处理,以获取第一播放音频信号。
S10032、第二无线耳机对第二待呈现音频信号进行渲染处理,以获取第二播放音频信号。
需要说明的是,S10031和S10032可以是同时发生的,不限定先后次序。
可选的,在S1021之前还包括:
所述第一无线耳机对所述第一待呈现音频信号进行解码处理,以获取第一解码音频信号;
对应的,所述第一无线耳机对所述第一待呈现音频信号进行渲染处理,包括:
所述第一无线耳机根据所述第一解码音频信号以及渲染元数据进行渲染处理,以获取所述第一播放音频信号。
在S1022之前,还包括:
所述第二无线耳机对所述第二待呈现音频信号进行解码处理,以获取第二解码音频信号;
对应的,所述第二无线耳机对所述第二待呈现音频信号进行渲染处理,包括:
所述第二无线耳机根据所述第二解码音频信号以及渲染元数据进行渲染处理,以获取所述第二播放音频信号。
可选的,所述渲染元数据包括第一无线耳机元数据、第二无线耳机元数据以及播放设备元数据中的至少一种。
可选的，所述第一无线耳机元数据包括第一耳机传感器元数据以及头相关变换函数HRTF数据库，其中，所述第一耳机传感器元数据用于表征所述第一无线耳机的运动特征；
所述第二无线耳机元数据包括第二耳机传感器元数据以及头相关变换函数HRTF数据库，其中，所述第二耳机传感器元数据用于表征所述第二无线耳机的运动特征；
所述播放设备元数据包括播放设备传感器元数据,其中,所述播放设备传感器元数据用于表征所述播放设备的运动特征。
可选的,在进行所述渲染处理之前,还包括:
所述第一无线耳机与所述第二无线耳机同步所述渲染元数据。
可选的,若所述第一无线耳机上设置有耳机传感器,所述第二无线耳机上未设置有耳机传感器,所述播放设备上未设置有播放设备传感器,则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据,包括:
所述第一无线耳机将所述第一耳机传感器元数据发送至所述第二无线耳机,所述第二无线耳机将所述第一耳机传感器元数据作为所述第二耳机传感器元数据。
若所述第一无线耳机与所述第二无线耳机上均设置有耳机传感器,所述播放设备上未设置有播放设备传感器,则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据,包括:
所述第一无线耳机将所述第一耳机传感器元数据发送至所述第二无线耳机,所述第二无线耳机将所述第二耳机传感器元数据发送至所述第一无线耳机;
所述第一无线耳机与所述第二无线耳机分别根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;或者,
所述第一无线耳机将所述第一耳机传感器元数据发送至所述播放设备,所述第二无线耳机将所述第二耳机传感器元数据发送至所述播放设备,以使所述播放设备根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;
所述第一无线耳机与所述第二无线耳机分别接收所述渲染元数据。
在一种可能的设计中,所述第一无线耳机上设置有耳机传感器,所述第二无线耳机上未设置有耳机传感器,所述播放设备上设置有播放设备传感器,则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据,包括:
所述第一无线耳机将所述第一耳机传感器元数据发送至所述播放设备,以使播放设备根据所述第一耳机传感器元数据、所述播放设备传感器元数据 以及预设数值算法确定所述渲染元数据;
所述第一无线耳机与所述第二无线耳机分别接收所述渲染元数据;或者,
所述第一无线耳机接收所述播放设备发送的播放设备传感器元数据;
所述第一无线耳机根据所述第一耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
所述第一无线耳机将所述渲染元数据发送至所述第二无线耳机。
在另一种可能的设计中,若所述第一无线耳机与所述第二无线耳机上均设置有耳机传感器,所述播放设备上设置有播放设备传感器,则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据,包括:
所述第一无线耳机将所述第一耳机传感器元数据发送至所述播放设备,所述第二无线耳机将所述第二耳机传感器元数据发送至所述播放设备,以使所述播放设备根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
所述第一无线耳机与所述第二无线耳机分别接收所述渲染元数据;或者,
所述第一无线耳机将所述第一耳机传感器元数据发送至所述第二无线耳机,所述第二无线耳机将所述第二耳机传感器元数据发送至所述第一无线耳机;
所述第一无线耳机与所述第二无线耳机分别接收所述播放设备传感器元数据;
所述第一无线耳机以及所述第二无线耳机分别根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据。
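上述"预设数值算法"的一个极简示例如下：对左右耳机传感器上报的朝向数据取平均，并扣除播放设备自身的转动，得到用于渲染的头部相对朝向（仅以偏航角为例，该算法形式与参数均为假设，实际预设数值算法由具体实现选定）：

```python
def fuse_sensor_metadata(first_yaw, second_yaw, device_yaw=None):
    # 对第一、第二耳机传感器元数据取平均，必要时扣除播放设备的转动分量
    head_yaw = (first_yaw + second_yaw) / 2.0
    if device_yaw is not None:
        head_yaw -= device_yaw
    return head_yaw
```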
具体的,当无线耳机为TWS真无线耳机时,即两只耳机分隔开,通过无线方式耦合,两只耳机可以分别具有各自的处理单元和传感器等。则第一无线耳机为左侧耳机,第二无线耳机为右侧耳机,此时第一无线耳机与第二无线耳机的同步渲染方式如下:
图11为本申请实施例提供的TWS真无线耳机关于声道信息的渲染过程示意图。
S1101~S1110的步骤介绍参考图4所示的HRTF渲染方法,在此不再赘述。需要说明的是,第一无线耳机与第二无线耳机的传感器元数据可以相互配合来调整两个耳机的数据同步,以达到更好的音效的效果。
S10041、所述第一无线耳机播放所述第一播放音频信号。
S10042、所述第二无线耳机播放所述第二播放音频信号。
需要说明的是,S10041和S10042可以是同时发生的,不限定先后次序。
在一种可能的设计中,所述待呈现音频信号包括基于声道的音频信号、基于对象的音频信号、基于场景的音频信号中的至少一种。
需要说明的是,所述渲染处理包括:双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
还需要说明的是,所述无线传输方式包括:蓝牙通信、红外线通信、WIFI通信、LIFI可见光通信。
此外,在一种可能的设计中,一个播放设备还可以同时与多对无线耳机连接,此时,仍可以参照上述实施例的方式对多对无线耳机进行音频信息的渲染分配,并且可以根据不同的无线耳机的处理能力对应地匹配不同的播放设备和无线耳机的渲染分工比例。可选的,多对无线耳机之间也可以通过由播放设备综合调度各对无线耳机间的渲染处理资源,即对于处理能力较弱的无线耳机,可以调用与相同播放设备连接的其它处理能力强的无线耳机来辅助进行音频信息渲染。
本实施例提供的音频处理方法,通过首先第一无线耳机与第二无线耳机端通过无线传输方式分别对应接收播放设备发送的第一待呈现音频信号和第二待呈现信号,然后再分别对应地进行渲染处理,以获取第一播放音频信号和第二播放音频信号,最后第一无线耳机和第二无线耳机在分别播放对应的播放音频信号。从而实现减少无线耳机与播放设备间渲染数据交互而产生的延迟,提高耳机音效的技术效果。
图12为本申请实施例提供的一种音频处理装置的结构示意图。如图12所示,本实施例提供的音频处理装置1200,包括:
获取模块,用于通过无线传输方式接收播放设备发送的待呈现音频信号,所述待呈现音频信号包括第一音频信号和/或第二音频信号,其中,所述第一音频信号为在所述播放设备渲染处理后的音频信号,所述第二音频信号为待渲染的音频信号;
渲染模块,用于在所述待呈现音频信号包括所述第二音频信号时,则对所述第二音频信号进行渲染处理,以获得第三音频信号;
播放模块，用于根据所述第一音频信号和/或所述第三音频信号进行后续音频播放。
在一种可能的设计中，所述获取模块，用于通过无线传输方式接收播放设备发送的待呈现音频信号之前，还包括：
发送模块,用于通过无线传输方式向所述播放设备发送指示信号,所述指示信号用于指示所述播放设备对原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
在一种可能的设计中,所述发送模块,用于通过无线传输方式向所述播放设备发送指示信号之前,还包括:
所述获取模块,还用于获取所述无线耳机的性能参数,并根据所述性能参数确定所述指示信号。
在一种可能的设计中,所述发送模块,用于通过无线传输方式向所述播放设备发送指示信号之前,还包括:
所述获取模块,还用于接收所述播放设备发送的音频特性信息,所述音频特性信息包括输入至所述播放设备的所述原始音频信号的特性参数,所述特性参数包括:码流格式、声道参数、对象参数以及场景成分参数中的至少一种。
在一种可能的设计中,所述指示信号包括标识码;
其中,若所述标识码为第一字段,则所述播放设备未对所述原始音频信号进行渲染,所述待呈现音频信号包括所述第二音频信号,未包括所述第一音频信号,所述无线耳机对所述原始音频信号进行全部渲染;
若所述标识码为第二字段,则所述播放设备对所述原始音频信号进行全部渲染,所述待呈现音频信号包括所述第一音频信号,未包括所述第二音频信号,所述无线耳机未对所述原始音频信号进行渲染;
若所述标识码为第三字段，则所述播放设备对所述原始音频信号进行部分渲染，所述待呈现音频信号包括所述第一音频信号和所述第二音频信号，所述无线耳机对所述原始音频信号剩余部分进行渲染。
在一种可能的设计中，在所述获取模块，用于通过无线传输方式接收播放设备发送的待呈现音频信号之后，还包括：
解码模块,用于对所述待呈现音频信号进行解码处理,以获得所述第一音频信号和/或所述第二音频信号。
在一种可能的设计中，所述渲染模块，用于对所述第二音频信号进行渲染处理，以获得第三音频信号，包括：
所述渲染模块,用于根据渲染元数据对所述第二音频信号进行渲染处理,以获得所述第三音频信号,其中,所述渲染元数据包括第一元数据以及第二元数据,所述第一元数据为所述播放设备端的元数据,所述第二元数据为无线耳机端的元数据。
在一种可能的设计中,所述第一元数据包括第一传感模块元数据,其中,所述第一传感模块元数据用于表征所述播放设备的运动特征;和/或,
所述第二元数据包括第二传感模块元数据以及头相关变换函数HRTF数据库,其中,所述第二传感模块元数据用于表征所述无线耳机的运动特征。
在一种可能的设计中，所述第一传感模块元数据通过第一传感模块获得，所述第一传感模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种；和/或，
所述第二传感模块元数据通过第二传感模块获得，所述第二传感模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种。
在一种可能的设计中,所述音频处理装置包括第一音频处理装置以及第二音频处理装置;
所述第一音频处理装置或所述第二音频处理装置中设置有所述第二传感子模块;或者,
所述第一音频处理装置与所述第二音频处理装置中均设置有所述第二传感子模块,在所述第一音频处理装置的获取模块与所述第二音频处理装置的获取模块,用于获取到所述播放设备传感器元数据之后,还包括:
同步模块,用于对所述播放设备传感器元数据进行相互同步。
在一种可能的设计中,所述第一音频处理装置包括:
第一接收模块,用于接收播放设备发送的第一待呈现音频信号;
第一渲染模块,用于对所述第一待呈现音频信号进行渲染处理,以获取第一播放音频信号;
第一播放模块,用于播放所述第一播放音频信号;
所述第二音频处理装置包括:
第二接收模块,用于接收所述播放设备发送的第二待呈现音频信号;
第二渲染模块，用于对所述第二待呈现音频信号进行渲染处理，以获取第二播放音频信号；
第二播放模块,用于播放所述第二播放音频信号。
在一种可能的设计中,所述第一音频处理装置,还包括:
第一解码模块,用于对所述第一待呈现音频信号进行解码处理,以获取第一解码音频信号;
所述第一渲染模块,具体用于:根据所述第一解码音频信号以及渲染元数据进行渲染处理,以获取所述第一播放音频信号;
所述第二音频处理装置,还包括:
第二解码模块,用于对所述第二待呈现音频信号进行解码处理,以获取第二解码音频信号;
所述第二渲染模块,具体用于:根据所述第二解码音频信号以及渲染元数据进行渲染处理,以获取所述第二播放音频信号。
在一种可能的设计中,所述渲染元数据包括第一无线耳机元数据、第二无线耳机元数据以及播放设备元数据中的至少一种。
在一种可能的设计中,所述第一无线耳机元数据包括第一耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述第一耳机传感器元数据用于表征所述第一无线耳机的运动特征;
所述第二无线耳机元数据包括第二耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述第二耳机传感器元数据用于表征所述第二无线耳机的运动特征;
所述播放设备元数据包括播放设备传感器元数据,其中,所述播放设备传感器元数据用于表征所述播放设备的运动特征。
在一种可能的设计中,所述第一音频处理装置,还包括:
第一同步模块,用于与所述第二无线耳机同步所述渲染元数据;和/或,
所述第二音频处理装置,还包括:
第二同步模块,用于与所述第一无线耳机同步所述渲染元数据。
在一种可能的设计中,所述第一同步模块,具体用于:将所述第一耳机传感器元数据发送至所述第二无线耳机,以使所述第二同步模块将所述第一耳机传感器元数据作为所述第二耳机传感器元数据。
在一种可能的设计中,所述第一同步模块,具体用于:
发送所述第一耳机传感器元数据;
接收所述第二耳机传感器元数据;
根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;
所述第二同步模块,具体用于:
发送所述第二耳机传感器元数据;
接收所述第一耳机传感器元数据;
根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;或者,
所述第一同步模块,具体用于:
发送所述第一耳机传感器元数据;
接收所述渲染元数据;
所述第二同步模块,具体用于:
发送所述第二耳机传感器元数据;
接收所述渲染元数据。
在一种可能的设计中,所述第一同步模块,具体用于:
接收播放设备传感器元数据;
根据所述第一耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
发送所述渲染元数据。
在一种可能的设计中,所述第一同步模块,具体用于:
发送所述第一耳机传感器元数据;
接收所述第二耳机传感器元数据;
接收所述播放设备传感器元数据;
根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
所述第二同步模块,具体用于:
发送所述第二耳机传感器元数据;
接收所述第一耳机传感器元数据;
接收所述播放设备传感器元数据;
根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据。
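上述各同步方式中提到的"预设数值算法",可以用如下示意性的Python草图说明:将两侧耳机传感器元数据(此处简化为航向角)取均值,再结合播放设备传感器元数据计算相对朝向,作为渲染元数据。算法与字段名均为本文示例假设,并非本申请限定的实现:

```python
# 示意性草图:一种可能的"预设数值算法"。
# 将第一、第二耳机传感器元数据(航向角)取均值得到头部朝向,
# 再减去播放设备航向,得到用于渲染的相对朝向。
# 注意:此处未处理角度绕 0°/360° 的折返,真实实现需考虑。

def fuse_render_metadata(yaw_left, yaw_right, yaw_device=0.0):
    head_yaw = (yaw_left + yaw_right) / 2.0         # 两耳机元数据融合
    relative_yaw = (head_yaw - yaw_device) % 360.0  # 相对播放设备的朝向
    return {"relative_yaw": relative_yaw}
```

两侧耳机与播放设备各自执行同一算法时,输入相同则得到相同的渲染元数据,从而实现同步。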
可选的,所述待呈现音频信号包括基于声道的音频信号、基于对象的音频信号、基于场景的音频信号中的至少一种。
可选的,所述渲染处理包括:双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
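上述双耳虚拟渲染的核心一步,可以用如下示意性的Python草图说明:将单声道信号分别与左、右耳HRTF对应的脉冲响应(HRIR)做卷积,得到双耳信号。示例中的HRIR系数为任意取值,仅用于说明流程,并非真实HRTF数据:

```python
# 示意性草图:双耳虚拟渲染的最简形式——单声道信号与左右耳 HRIR 卷积。

def convolve(x, h):
    """朴素线性卷积,输出长度为 len(x) + len(h) - 1。"""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def binaural_render(mono, hrir_left, hrir_right):
    """返回 (左耳信号, 右耳信号)。"""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)
```

真实实现通常按渲染元数据(如相对朝向)从HRTF数据库中选取对应方向的HRIR,再做分块快速卷积。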
可选的,所述无线传输方式包括:蓝牙通信、红外线通信、WIFI通信、LIFI可见光通信。
值得说明的是,图12所示实施例提供的音频处理装置,可以执行上述任一方法实施例所提供的无线耳机端对应的方法,其具体实现原理、技术特征、专业名词解释以及技术效果类似,在此不再赘述。
图13为本申请实施例提供的另一种音频处理装置的结构示意图。如图13所示,本实施例提供的音频处理装置1300,包括:
获取模块,用于获取原始音频信号,并根据所述原始音频信号生成待呈现音频信号,所述待呈现音频信号包括第一音频信号和/或第二音频信号,其中,所述第一音频信号为在播放设备渲染处理后的音频信号,所述第二音频信号为待渲染的音频信号;
发送模块,用于通过无线传输方式向无线耳机发送待呈现音频信号。
在一种可能的设计中,在所述发送模块,用于通过无线传输方式向无线耳机发送待呈现音频信号之前,包括:
所述获取模块,还用于通过所述无线传输方式接收所述无线耳机发送的指示信号,所述指示信号用于指示所述播放设备对所述原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
在一种可能的设计中,在所述发送模块,用于通过无线传输方式向无线耳机发送待呈现音频信号之前,还包括:
所述获取模块,还用于通过所述无线传输方式接收所述无线耳机的性能参数,并根据所述性能参数确定指示信号,所述指示信号用于指示所述播放设备对所述原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
在一种可能的设计中,所述获取模块,还用于通过所述无线传输方式接收所述无线耳机的性能参数,并根据所述性能参数确定指示信号,包括:
所述获取模块,还用于获取所述原始音频信号的特性参数,所述特性参数包括:码流格式、声道参数、对象参数以及场景成分参数中的至少一种;
所述获取模块,还用于根据所述特性参数以及所述性能参数确定所述指示信号。
可选的,所述指示信号包括标识码;
其中,若所述标识码为第一字段,则所述播放设备未对所述原始音频信号进行渲染,则所述待呈现音频信号包括所述第二音频信号,未包括所述第一音频信号,所述无线耳机对所述原始音频信号进行全部渲染;
若所述标识码为第二字段,则所述播放设备对所述原始音频信号进行全部渲染,则所述待呈现音频信号包括所述第一音频信号,未包括所述第二音频信号,所述无线耳机未对所述原始音频信号进行渲染;
若所述标识码为第三字段,则所述播放设备对所述原始音频信号进行部分渲染,则所述待呈现音频信号包括所述第一音频信号和所述第二音频信号,所述无线耳机对所述原始音频信号剩余部分进行渲染。
可选的,所述原始音频信号包括第四音频信号和/或第五音频信号,其中,所述第四音频信号用于处理后生成所述第一音频信号,所述第五音频信号用于生成所述第二音频信号;
对应的,在所述获取模块,用于获取原始音频信号之后,还包括:
解码模块,用于对所述第四音频信号进行解码处理,以获得第六音频信号,所述第六音频信号包括第七音频信号和/或第八音频信号;
渲染模块,用于对所述第七音频信号进行渲染处理,以获取第九音频信号;
编码模块,用于对所述第八音频信号以及所述第九音频信号进行编码,以获取第十音频信号,所述待呈现音频信号包括所述第五音频信号以及所述第十音频信号。
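上述播放设备端对第四至第十音频信号的处理流程,可以用如下示意性的Python草图概括。其中 decode、render、encode 均为占位实现,仅用于说明各信号之间的对应关系,并非真实编解码器:

```python
# 示意性草图:播放设备端流程——对第四音频信号解码得到第六音频信号
# (含待渲染的第七信号与免渲染的第八信号),渲染第七信号得到第九信号,
# 再将第八、第九信号编码为第十信号;待呈现音频信号由第五与第十信号组成。

def decode(signal):
    """占位解码:按示例约定,首个元素为待渲染成分,其余为免渲染成分。"""
    return {"to_render": signal[:1], "pass_through": signal[1:]}

def render(frames):
    """占位渲染:示例中仅施加固定增益。"""
    return [x * 2 for x in frames]

def encode(frames):
    """占位编码:示例中原样打包。"""
    return list(frames)

def device_side_process(fourth, fifth):
    sixth = decode(fourth)                                   # 第六音频信号
    seventh, eighth = sixth["to_render"], sixth["pass_through"]
    ninth = render(seventh)                                  # 第九音频信号
    tenth = encode(eighth + ninth)                           # 第十音频信号
    return {"fifth": fifth, "tenth": tenth}                  # 待呈现音频信号
```

可见第五音频信号透传给耳机渲染,而第四音频信号经"解码-部分渲染-再编码"后以第十音频信号发送。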
在一种可能的设计中,所述渲染模块,用于对所述第七音频信号进行渲染处理,包括:
所述渲染模块,用于根据渲染元数据对所述第七音频信号进行渲染处理,以获得所述第九音频信号,其中,所述渲染元数据包括第一元数据以及第二元数据,所述第一元数据为所述播放设备端的元数据,所述第二元数据为无线耳机端的元数据。
在一种可能的设计中,所述第一元数据包括第一传感子模块元数据,其中,所述第一传感子模块元数据用于表征所述播放设备的运动特征;和/或,
所述第二元数据包括第二传感子模块元数据以及头相关变换函数HRTF数据库,其中,所述第二传感子模块元数据用于表征所述无线耳机的运动特征。
在一种可能的设计中,所述第一传感子模块元数据通过第一传感子模块获得,所述第一传感子模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种;和/或,
所述第二传感子模块元数据通过第二传感子模块获得,所述第二传感子模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种。
可选的,所述待呈现音频信号包括基于声道的音频信号、基于对象的音频信号、基于场景的音频信号中的至少一种。
可选的,所述渲染处理包括:双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
可选的,所述无线传输方式包括:蓝牙通信、红外线通信、WIFI通信、LIFI可见光通信。
值得说明的是,图13所示实施例提供的音频处理装置,可以执行上述任一方法实施例所提供的播放设备端对应的方法,其具体实现原理、技术特征、专业名词解释以及技术效果类似,在此不再赘述。
图14为本申请提供的一种无线耳机的结构示意图。如图14所示,该电子设备1400可以包括:至少一个处理器1401和存储器1402。图14示出的是以一个处理器为例的电子设备。
存储器1402,用于存放程序。具体地,程序可以包括程序代码,程序代码包括计算机操作指令。
存储器1402可能包含高速RAM存储器,也可能还包括非易失性存储器(non-volatile memory),例如至少一个磁盘存储器。
处理器1401用于执行存储器1402存储的计算机执行指令,以实现以上各方法实施例所述的无线耳机端所对应的方法。
其中,处理器1401可能是一个中央处理器(central processing unit,简称为CPU),或者是特定集成电路(application specific integrated circuit,简称为ASIC),或者是被配置成实施本申请实施例的一个或多个集成电路。
可选地,存储器1402既可以是独立的,也可以跟处理器1401集成在一起。当所述存储器1402是独立于处理器1401之外的器件时,所述电子设备1400,还可以包括:
总线1403,用于连接所述处理器1401以及所述存储器1402。总线可以是工业标准体系结构(industry standard architecture,简称为ISA)总线、外部设备互连(peripheral component interconnect,PCI)总线或扩展工业标准体系结构(extended industry standard architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等,但并不表示仅有一根总线或一种类型的总线。
可选的,在具体实现上,如果存储器1402和处理器1401集成在一块芯片上实现,则存储器1402和处理器1401可以通过内部接口完成通信。
图15为本申请提供的另一种播放设备的结构示意图。如图15所示,该电子设备1500可以包括:至少一个处理器1501和存储器1502。图15示出的是以一个处理器为例的电子设备。
存储器1502,用于存放程序。具体地,程序可以包括程序代码,程序代码包括计算机操作指令。
存储器1502可能包含高速RAM存储器,也可能还包括非易失性存储器(non-volatile memory),例如至少一个磁盘存储器。
处理器1501用于执行存储器1502存储的计算机执行指令,以实现以上各方法实施例所述的播放设备端所对应的方法。
其中,处理器1501可能是一个中央处理器(central processing unit,简称为CPU),或者是特定集成电路(application specific integrated circuit,简称为ASIC),或者是被配置成实施本申请实施例的一个或多个集成电路。
可选地,存储器1502既可以是独立的,也可以跟处理器1501集成在一起。当所述存储器1502是独立于处理器1501之外的器件时,所述电子设备1500,还可以包括:
总线1503,用于连接所述处理器1501以及所述存储器1502。总线可以是工业标准体系结构(industry standard architecture,简称为ISA)总线、外部设备互连(peripheral component interconnect,PCI)总线或扩展工业标准体系结构(extended industry standard architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等,但并不表示仅有一根总线或一种类型的总线。
可选的,在具体实现上,如果存储器1502和处理器1501集成在一块芯片上实现,则存储器1502和处理器1501可以通过内部接口完成通信。
本申请还提供了一种计算机可读存储介质,该计算机可读存储介质可以包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁盘或者光盘等各种可以存储程序代码的介质,具体的,该计算机可读存储介质中存储有程序指令,程序指令用于上述各实施例中无线耳机端所对应的方法。
本申请还提供了一种计算机可读存储介质,该计算机可读存储介质可以包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁盘或者光盘等各种可以存储程序代码的介质,具体的,该计算机可读存储介质中存储有程序指令,程序指令用于上述各实施例中播放设备端所对应的方法。
最后应说明的是:以上各实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述各实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分或者全部技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的范围。

Claims (74)

  1. 一种音频处理方法,其特征在于,应用于无线耳机,所述方法包括:
    通过无线传输方式接收播放设备发送的待呈现音频信号,所述待呈现音频信号包括第一音频信号和/或第二音频信号,其中,所述第一音频信号为在所述播放设备渲染处理后的音频信号,所述第二音频信号为待渲染的音频信号;
    若所述待呈现音频信号包括所述第二音频信号,则对所述第二音频信号进行渲染处理,以获得第三音频信号;
    根据所述第一音频信号和/或所述第三音频信号进行后续音频播放。
  2. 根据权利要求1所述的音频处理方法,其特征在于,在所述通过无线传输方式接收播放设备发送的待呈现音频信号之前,包括:
    通过无线传输方式向所述播放设备发送指示信号,所述指示信号用于指示所述播放设备对原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
  3. 根据权利要求2所述的音频处理方法,其特征在于,在所述通过无线传输方式向所述播放设备发送指示信号之前,还包括:
    获取所述无线耳机的性能参数,并根据所述性能参数确定所述指示信号。
  4. 根据权利要求3所述的音频处理方法,其特征在于,在所述通过无线传输方式向所述播放设备发送指示信号之前,还包括:
    接收所述播放设备发送的音频特性信息,所述音频特性信息包括输入至所述播放设备的所述原始音频信号的特性参数,所述特性参数包括:码流格式、声道参数、对象参数以及场景成分参数中的至少一种。
  5. 根据权利要求2-4中任意一项所述的音频处理方法,其特征在于,所述指示信号包括标识码;
    其中,若所述标识码为第一字段,则所述播放设备未对所述原始音频信号进行渲染,所述待呈现音频信号包括所述第二音频信号,未包括所述第一音频信号,所述无线耳机对所述原始音频信号进行全部渲染;
    若所述标识码为第二字段,则所述播放设备对所述原始音频信号进行全部渲染,所述待呈现音频信号包括所述第一音频信号,未包括所述第二音频信号,所述无线耳机未对所述原始音频信号进行渲染;
    若所述标识码为第三字段,则所述播放设备对所述原始音频信号进行部分渲染,所述待呈现音频信号包括所述第一音频信号和所述第二音频信号,所述无线耳机对所述原始音频信号剩余部分进行渲染。
  6. 根据权利要求1-4中任意一项所述的音频处理方法,其特征在于,在所述通过无线传输方式接收播放设备发送的待呈现音频信号之后,还包括:
    对所述待呈现音频信号进行解码处理,以获得所述第一音频信号和/或所述第二音频信号。
  7. 根据权利要求1-4中任意一项所述的音频处理方法,其特征在于,所述对所述第二音频信号进行渲染处理,以获得第三音频信号,包括:
    根据渲染元数据对所述第二音频信号进行渲染处理,以获得所述第三音频信号,其中,所述渲染元数据包括第一元数据以及第二元数据,所述第一元数据为所述播放设备端的元数据,所述第二元数据为无线耳机端的元数据。
  8. 根据权利要求7所述的音频处理方法,其特征在于,所述第一元数据包括播放设备传感器元数据,其中,所述播放设备传感器元数据用于表征所述播放设备的运动特征;和/或,
    所述第二元数据包括耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述耳机传感器元数据用于表征所述无线耳机的运动特征。
  9. 根据权利要求8所述的音频处理方法,其特征在于,所述耳机传感器元数据通过耳机传感器获得,所述耳机传感器包括陀螺仪传感器、头部大小传感器、测距传感器、地磁传感器以及加速度传感器中的至少一种;和/或,
    所述播放设备传感器元数据通过播放设备传感器获得,所述播放设备传感器包括陀螺仪传感器、头部大小传感器、测距传感器、地磁传感器以及加速度传感器中的至少一种。
  10. 根据权利要求9所述的音频处理方法,其特征在于,所述无线耳机包括第一无线耳机以及第二无线耳机;
    所述第一无线耳机或所述第二无线耳机中设置有所述耳机传感器;或者,
    所述第一无线耳机与所述第二无线耳机中均设置有所述耳机传感器,则在所述第一无线耳机与所述第二无线耳机分别获取到所述耳机传感器元数据之后,对所述耳机传感器元数据进行相互同步。
  11. 根据权利要求10所述的音频处理方法,其特征在于,所述第一无线耳机与所述第二无线耳机用于与所述播放设备建立无线连接;所述通过无线传输方式接收播放设备发送的待呈现音频信号,包括:
    所述第一无线耳机接收所述播放设备发送的第一待呈现音频信号,所述第二无线耳机接收所述播放设备发送的第二待呈现音频信号;
    对应的,在所述无线耳机中的渲染处理,包括:
    所述第一无线耳机对所述第一待呈现音频信号进行渲染处理,以获取第一播放音频信号,所述第二无线耳机对所述第二待呈现音频信号进行渲染处理,以获取第二播放音频信号;
    所述第一无线耳机播放所述第一播放音频信号,所述第二无线耳机播放所述第二播放音频信号。
  12. 根据权利要求11所述的音频处理方法,其特征在于,在所述第一无线耳机对所述第一待呈现音频信号进行渲染处理之前,还包括:
    所述第一无线耳机对所述第一待呈现音频信号进行解码处理,以获取第一解码音频信号;
    对应的,所述第一无线耳机对所述第一待呈现音频信号进行渲染处理,包括:
    所述第一无线耳机根据所述第一解码音频信号以及渲染元数据进行渲染处理,以获取所述第一播放音频信号;以及
    在所述第二无线耳机对所述第二待呈现音频信号进行渲染处理之前,还包括:
    所述第二无线耳机对所述第二待呈现音频信号进行解码处理,以获取第二解码音频信号;
    对应的,所述第二无线耳机对所述第二待呈现音频信号进行渲染处理,包括:
    所述第二无线耳机根据所述第二解码音频信号以及渲染元数据进行渲染处理,以获取所述第二播放音频信号。
  13. 根据权利要求12所述的音频处理方法,其特征在于,所述渲染元数据包括第一无线耳机元数据、第二无线耳机元数据以及播放设备元数据中的至少一种。
  14. 根据权利要求13所述的音频处理方法,其特征在于,所述第一无线耳机元数据包括第一耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述第一耳机传感器元数据用于表征所述第一无线耳机的运动特征;
    所述第二无线耳机元数据包括第二耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述第二耳机传感器元数据用于表征所述第二无线耳机的运动特征;
    所述播放设备元数据包括播放设备传感器元数据,其中,所述播放设备传感器元数据用于表征所述播放设备的运动特征。
  15. 根据权利要求14所述的音频处理方法,其特征在于,在进行所述渲染处理之前,还包括:
    所述第一无线耳机与所述第二无线耳机同步所述渲染元数据。
  16. 根据权利要求15所述的音频处理方法,其特征在于,若所述第一无线耳机上设置有耳机传感器,所述第二无线耳机上未设置有耳机传感器,所述播放设备上未设置有播放设备传感器,则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据,包括:
    所述第一无线耳机将所述第一耳机传感器元数据发送至所述第二无线耳机,所述第二无线耳机将所述第一耳机传感器元数据作为所述第二耳机传感器元数据。
  17. 根据权利要求15所述的音频处理方法,其特征在于,若所述第一无线耳机与所述第二无线耳机上均设置有耳机传感器,所述播放设备上未设置有播放设备传感器,则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据,包括:
    所述第一无线耳机将所述第一耳机传感器元数据发送至所述第二无线耳机,所述第二无线耳机将所述第二耳机传感器元数据发送至所述第一无线耳机;
    所述第一无线耳机与所述第二无线耳机分别根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;或者,
    所述第一无线耳机将所述第一耳机传感器元数据发送至所述播放设备,所述第二无线耳机将所述第二耳机传感器元数据发送至所述播放设备,以使所述播放设备根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;
    所述第一无线耳机与所述第二无线耳机分别接收所述渲染元数据。
  18. 根据权利要求15所述的音频处理方法,其特征在于,若所述第一无线耳机上设置有耳机传感器,所述第二无线耳机上未设置有耳机传感器,所述播放设备上设置有播放设备传感器,则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据,包括:
    所述第一无线耳机将所述第一耳机传感器元数据发送至所述播放设备,以使播放设备根据所述第一耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
    所述第一无线耳机与所述第二无线耳机分别接收所述渲染元数据;或者,
    所述第一无线耳机接收所述播放设备发送的播放设备传感器元数据;
    所述第一无线耳机根据所述第一耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
    所述第一无线耳机将所述渲染元数据发送至所述第二无线耳机。
  19. 根据权利要求15所述的音频处理方法,其特征在于,若所述第一无线耳机与所述第二无线耳机上均设置有耳机传感器,所述播放设备上设置有播放设备传感器,则所述第一无线耳机与所述第二无线耳机同步所述渲染元数据,包括:
    所述第一无线耳机将所述第一耳机传感器元数据发送至所述播放设备,所述第二无线耳机将所述第二耳机传感器元数据发送至所述播放设备,以使所述播放设备根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
    所述第一无线耳机与所述第二无线耳机分别接收所述渲染元数据;或者,
    所述第一无线耳机将所述第一耳机传感器元数据发送至所述第二无线耳机,所述第二无线耳机将所述第二耳机传感器元数据发送至所述第一无线耳机;
    所述第一无线耳机与所述第二无线耳机分别接收所述播放设备传感器元数据;
    所述第一无线耳机以及所述第二无线耳机分别根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据。
  20. 根据权利要求1-4中任意一项所述的音频处理方法,其特征在于,所述待呈现音频信号包括基于声道的音频信号、基于对象的音频信号、基于场景的音频信号中的至少一种。
  21. 根据权利要求1-4中任意一项所述的音频处理方法,其特征在于,所述渲染处理包括:双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
  22. 根据权利要求1-4中任意一项所述的音频处理方法,其特征在于,所述无线传输方式包括:蓝牙通信、红外线通信、WIFI通信、LIFI可见光通信。
  23. 一种音频处理方法,其特征在于,应用于播放设备,所述方法包括:
    获取原始音频信号,并根据所述原始音频信号生成待呈现音频信号,所述待呈现音频信号包括第一音频信号和/或第二音频信号,其中,所述第一音频信号为在所述播放设备渲染处理后的音频信号,所述第二音频信号为待渲染的音频信号;
    通过无线传输方式向无线耳机发送所述待呈现音频信号。
  24. 根据权利要求23所述的音频处理方法,其特征在于,在所述通过无线传输方式向无线耳机发送所述待呈现音频信号之前,包括:
    通过所述无线传输方式接收所述无线耳机发送的指示信号,所述指示信号用于指示所述播放设备对所述原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
  25. 根据权利要求23所述的音频处理方法,其特征在于,在所述通过无线传输方式向无线耳机发送所述待呈现音频信号之前,还包括:
    通过所述无线传输方式接收所述无线耳机的性能参数,并根据所述性能参数确定指示信号,所述指示信号用于指示所述播放设备对所述原始音 频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
  26. 根据权利要求25所述的音频处理方法,其特征在于,所述通过所述无线传输方式接收所述无线耳机的性能参数,并根据所述性能参数确定所述指示信号,包括:
    获取所述原始音频信号的特性参数,所述特性参数包括:码流格式、声道参数、对象参数以及场景成分参数中的至少一种;
    根据所述特性参数以及所述性能参数确定所述指示信号。
  27. 根据权利要求24-26中任意一项所述的音频处理方法,其特征在于,所述指示信号包括标识码;
    其中,若所述标识码为第一字段,则所述播放设备未对所述原始音频信号进行渲染,则所述待呈现音频信号包括所述第二音频信号,未包括所述第一音频信号,所述无线耳机对所述原始音频信号进行全部渲染;
    若所述标识码为第二字段,则所述播放设备对所述原始音频信号进行全部渲染,则所述待呈现音频信号包括所述第一音频信号,未包括所述第二音频信号,所述无线耳机未对所述原始音频信号进行渲染;
    若所述标识码为第三字段,则所述播放设备对所述原始音频信号进行部分渲染,则所述待呈现音频信号包括所述第一音频信号和所述第二音频信号,所述无线耳机对所述原始音频信号剩余部分进行渲染。
  28. 根据权利要求23-26中任意一项所述的音频处理方法,其特征在于,所述原始音频信号包括第四音频信号和/或第五音频信号,其中,所述第四音频信号用于处理后生成所述第一音频信号,所述第五音频信号用于生成所述第二音频信号;
    对应的,在所述获取原始音频信号之后,还包括:
    对所述第四音频信号进行解码处理,以获得第六音频信号,所述第六音频信号包括第七音频信号和/或第八音频信号;
    对所述第七音频信号进行渲染处理,以获取第九音频信号;
    对所述第八音频信号以及所述第九音频信号进行编码,以获取第十音频信号,所述待呈现音频信号包括所述第五音频信号以及所述第十音频信号。
  29. 根据权利要求28所述的音频处理方法,其特征在于,所述对所述第七音频信号进行渲染处理,包括:
    根据渲染元数据对所述第七音频信号进行渲染处理,以获得所述第九音频信号,其中,所述渲染元数据包括第一元数据以及第二元数据,所述第一元数据为所述播放设备端的元数据,所述第二元数据为无线耳机端的元数据。
  30. 根据权利要求29所述的音频处理方法,其特征在于,所述第一元数据包括播放设备传感器元数据,其中,所述播放设备传感器元数据用于表征所述播放设备的运动特征;和/或,
    所述第二元数据包括耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述耳机传感器元数据用于表征所述无线耳机的运动特征。
  31. 根据权利要求30所述的音频处理方法,其特征在于,所述耳机传感器元数据通过耳机传感器获得,所述耳机传感器包括陀螺仪传感器、头部大小传感器、测距传感器、地磁传感器以及加速度传感器中的至少一种;和/或,
    所述播放设备传感器元数据通过播放设备传感器获得,所述播放设备传感器包括陀螺仪传感器、头部大小传感器、测距传感器、地磁传感器以及加速度传感器中的至少一种。
  32. 根据权利要求23-26中任意一项所述的音频处理方法,其特征在于,所述待呈现音频信号包括基于声道的音频信号、基于对象的音频信号、基于场景的音频信号中的至少一种。
  33. 根据权利要求23-26中任意一项所述的音频处理方法,其特征在于,所述渲染处理包括:双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
  34. 根据权利要求23-26中任意一项所述的音频处理方法,其特征在于,所述无线传输方式包括:蓝牙通信、红外线通信、WIFI通信、LIFI可见光通信。
  35. 一种音频处理装置,其特征在于,包括:
    获取模块,用于通过无线传输方式接收播放设备发送的待呈现音频信号,所述待呈现音频信号包括第一音频信号和/或第二音频信号,其中,所述第一音频信号为在所述播放设备渲染处理后的音频信号,所述第二音频信号为待渲染的音频信号;
    渲染模块,用于在所述待呈现音频信号包括所述第二音频信号时,对所述第二音频信号进行渲染处理,以获得第三音频信号;
    播放模块,用于根据所述第一音频信号和/或所述第三音频信号进行后续音频播放。
  36. 根据权利要求35所述的音频处理装置,其特征在于,所述获取模块,用于通过无线传输方式接收播放设备发送的待呈现音频信号之前,还包括:
    发送模块,用于通过无线传输方式向所述播放设备发送指示信号,所述指示信号用于指示所述播放设备对原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
  37. 根据权利要求36所述的音频处理装置,其特征在于,所述发送模块,用于通过无线传输方式向所述播放设备发送指示信号之前,还包括:
    所述获取模块,还用于获取所述无线耳机的性能参数,并根据所述性能参数确定所述指示信号。
  38. 根据权利要求37所述的音频处理装置,其特征在于,所述发送模块,用于通过无线传输方式向所述播放设备发送指示信号之前,还包括:
    所述获取模块,还用于接收所述播放设备发送的音频特性信息,所述音频特性信息包括输入至所述播放设备的所述原始音频信号的特性参数,所述特性参数包括:码流格式、声道参数、对象参数以及场景成分参数中的至少一种。
  39. 根据权利要求36-38中任意一项所述的音频处理装置,其特征在于,所述指示信号包括标识码;
    其中,若所述标识码为第一字段,则所述播放设备未对所述原始音频信号进行渲染,所述待呈现音频信号包括所述第二音频信号,未包括所述第一音频信号,所述音频处理装置对所述原始音频信号进行全部渲染;
    若所述标识码为第二字段,则所述播放设备对所述原始音频信号进行全部渲染,所述待呈现音频信号包括所述第一音频信号,未包括所述第二音频信号,所述音频处理装置未对所述原始音频信号进行渲染;
    若所述标识码为第三字段,则所述播放设备对所述原始音频信号进行部分渲染,所述待呈现音频信号包括所述第一音频信号和所述第二音频信号,所述音频处理装置对所述原始音频信号剩余部分进行渲染。
  40. 根据权利要求35-38中任意一项所述的音频处理装置,其特征在于,在所述获取模块,用于通过无线传输方式接收播放设备发送的待呈现音频信号之后,还包括:
    解码模块,用于对所述待呈现音频信号进行解码处理,以获得所述第一音频信号和/或所述第二音频信号。
  41. 根据权利要求35-38中任意一项所述的音频处理装置,其特征在于,所述渲染模块,用于对所述第二音频信号进行渲染处理,以获得第三音频信号,包括:
    所述渲染模块,用于根据渲染元数据对所述第二音频信号进行渲染处理,以获得所述第三音频信号,其中,所述渲染元数据包括第一元数据以及第二元数据,所述第一元数据为所述播放设备端的元数据,所述第二元数据为无线耳机端的元数据。
  42. 根据权利要求41所述的音频处理装置,其特征在于,所述第一元数据包括第一传感模块元数据,其中,所述第一传感模块元数据用于表征所述播放设备的运动特征;和/或,
    所述第二元数据包括第二传感模块元数据以及头相关变换函数HRTF数据库,其中,所述第二传感模块元数据用于表征所述无线耳机的运动特征。
  43. 根据权利要求42所述的音频处理装置,其特征在于,所述第一传感模块元数据通过第一传感模块获得,所述第一传感模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种;和/或,
    所述第二传感模块元数据通过第二传感模块获得,所述第二传感模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种。
  44. 根据权利要求43所述的音频处理装置,其特征在于,所述音频处理装置包括第一音频处理装置以及第二音频处理装置;
    所述第一音频处理装置或所述第二音频处理装置中设置有所述第二传感模块;或者,
    所述第一音频处理装置与所述第二音频处理装置中均设置有所述第二传感模块,在所述第一音频处理装置的获取模块与所述第二音频处理装置的获取模块分别获取到所述第二传感模块元数据之后,还包括:
    同步模块,用于对所述第二传感模块元数据进行相互同步。
  45. 根据权利要求44所述的音频处理装置,其特征在于,所述第一音频处理装置包括:
    第一接收模块,用于接收播放设备发送的第一待呈现音频信号;
    第一渲染模块,用于对所述第一待呈现音频信号进行渲染处理,以获取第一播放音频信号;
    第一播放模块,用于播放所述第一播放音频信号;
    所述第二音频处理装置包括:
    第二接收模块,用于接收所述播放设备发送的第二待呈现音频信号;
    第二渲染模块,用于对所述第二待呈现音频信号进行渲染处理,以获取第二播放音频信号;
    第二播放模块,用于播放所述第二播放音频信号。
  46. 根据权利要求45所述的音频处理装置,其特征在于,所述第一音频处理装置,还包括:
    第一解码模块,用于对所述第一待呈现音频信号进行解码处理,以获取第一解码音频信号;
    所述第一渲染模块,具体用于:根据所述第一解码音频信号以及渲染元数据进行渲染处理,以获取所述第一播放音频信号;
    所述第二音频处理装置,还包括:
    第二解码模块,用于对所述第二待呈现音频信号进行解码处理,以获取第二解码音频信号;
    所述第二渲染模块,具体用于:根据所述第二解码音频信号以及渲染元数据进行渲染处理,以获取所述第二播放音频信号。
  47. 根据权利要求46所述的音频处理装置,其特征在于,所述渲染元数据包括第一无线耳机元数据、第二无线耳机元数据以及播放设备元数据中的至少一种。
  48. 根据权利要求47所述的音频处理装置,其特征在于,所述第一无线耳机元数据包括第一耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述第一耳机传感器元数据用于表征所述第一无线耳机的运动特征;
    所述第二无线耳机元数据包括第二耳机传感器元数据以及头相关变换函数HRTF数据库,其中,所述第二耳机传感器元数据用于表征所述第二无线耳机的运动特征;
    所述播放设备元数据包括播放设备传感器元数据,其中,所述播放设备传感器元数据用于表征所述播放设备的运动特征。
  49. 根据权利要求48所述的音频处理装置,其特征在于,所述第一音频处理装置,还包括:
    第一同步模块,用于与所述第二无线耳机同步所述渲染元数据;和/或,
    所述第二音频处理装置,还包括:
    第二同步模块,用于与所述第一无线耳机同步所述渲染元数据。
  50. 根据权利要求49所述的音频处理装置,其特征在于,所述第一同步模块,具体用于:将所述第一耳机传感器元数据发送至所述第二无线耳机,以使所述第二同步模块将所述第一耳机传感器元数据作为所述第二耳机传感器元数据。
  51. 根据权利要求49所述的音频处理装置,其特征在于,所述第一同步模块,具体用于:
    发送所述第一耳机传感器元数据;
    接收所述第二耳机传感器元数据;
    根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;
    所述第二同步模块,具体用于:
    发送所述第二耳机传感器元数据;
    接收所述第一耳机传感器元数据;
    根据所述第一耳机传感器元数据、所述第二耳机传感器元数据以及预设数值算法确定所述渲染元数据;或者,
    所述第一同步模块,具体用于:
    发送所述第一耳机传感器元数据;
    接收所述渲染元数据;
    所述第二同步模块,具体用于:
    发送所述第二耳机传感器元数据;
    接收所述渲染元数据。
  52. 根据权利要求49所述的音频处理装置,其特征在于,所述第一同步模块,具体用于:
    接收播放设备传感器元数据;
    根据所述第一耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
    发送所述渲染元数据。
  53. 根据权利要求49所述的音频处理装置,其特征在于,所述第一同步模块,具体用于:
    发送所述第一耳机传感器元数据;
    接收所述第二耳机传感器元数据;
    接收所述播放设备传感器元数据;
    根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据;
    所述第二同步模块,具体用于:
    发送所述第二耳机传感器元数据;
    接收所述第一耳机传感器元数据;
    接收所述播放设备传感器元数据;
    根据所述第一耳机传感器元数据、所述第二耳机传感器元数据、所述播放设备传感器元数据以及预设数值算法确定所述渲染元数据。
  54. 根据权利要求35-38中任意一项所述的音频处理装置,其特征在于,所述待呈现音频信号包括基于声道的音频信号、基于对象的音频信号、基于场景的音频信号中的至少一种。
  55. 根据权利要求35-38中任意一项所述的音频处理装置,其特征在于,所述渲染处理包括:双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
  56. 根据权利要求35-38中任意一项所述的音频处理装置,其特征在于,所述无线传输方式包括:蓝牙通信、红外线通信、WIFI通信、LIFI可见光通信。
  57. 一种音频处理装置,其特征在于,包括:
    获取模块,用于获取原始音频信号,并根据所述原始音频信号生成待呈现音频信号,所述待呈现音频信号包括第一音频信号和/或第二音频信号,其中,所述第一音频信号为在播放设备渲染处理后的音频信号,所述第二音频信号为待渲染的音频信号;
    发送模块,用于通过无线传输方式向无线耳机发送待呈现音频信号。
  58. 根据权利要求57所述的音频处理装置,其特征在于,在所述发送模块,用于通过无线传输方式向无线耳机发送待呈现音频信号之前,包括:
    所述获取模块,还用于通过所述无线传输方式接收所述无线耳机发送的指示信号,所述指示信号用于指示所述播放设备对所述原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
  59. 根据权利要求57所述的音频处理装置,其特征在于,在所述发送模块,用于通过无线传输方式向无线耳机发送待呈现音频信号之前,还包括:
    所述获取模块,还用于通过所述无线传输方式接收所述无线耳机的性能参数,并根据所述性能参数确定指示信号,所述指示信号用于指示所述播放设备对所述原始音频信号按照对应的预设处理方式进行渲染,以获取所述待呈现音频信号。
  60. 根据权利要求59所述的音频处理装置,其特征在于,所述获取模块,还用于通过所述无线传输方式接收所述无线耳机的性能参数,并根据所述性能参数确定指示信号,包括:
    所述获取模块,还用于获取所述原始音频信号的特性参数,所述特性参数包括:码流格式、声道参数、对象参数以及场景成分参数中的至少一种;
    所述获取模块,还用于根据所述特性参数以及所述性能参数确定所述指示信号。
  61. 根据权利要求58-60中任意一项所述的音频处理装置,其特征在于,所述指示信号包括标识码;
    其中,若所述标识码为第一字段,则所述播放设备未对所述原始音频信号进行渲染,则所述待呈现音频信号包括所述第二音频信号,未包括所述第一音频信号,所述音频处理装置对所述原始音频信号进行全部渲染;
    若所述标识码为第二字段,则所述播放设备对所述原始音频信号进行全部渲染,则所述待呈现音频信号包括所述第一音频信号,未包括所述第二音频信号,所述音频处理装置未对所述原始音频信号进行渲染;
    若所述标识码为第三字段,则所述播放设备对所述原始音频信号进行部分渲染,则所述待呈现音频信号包括所述第一音频信号和所述第二音频信号,所述音频处理装置对所述原始音频信号剩余部分进行渲染。
  62. 根据权利要求57-60中任意一项所述的音频处理装置,其特征在于,所述原始音频信号包括第四音频信号和/或第五音频信号,其中,所述第四音频信号用于处理后生成所述第一音频信号,所述第五音频信号用于生成所述第二音频信号;
    对应的,在所述获取模块,用于获取原始音频信号之后,还包括:
    解码模块,用于对所述第四音频信号进行解码处理,以获得第六音频信号,所述第六音频信号包括第七音频信号和/或第八音频信号;
    渲染模块,用于对所述第七音频信号进行渲染处理,以获取第九音频信号;
    编码模块,用于对所述第八音频信号以及所述第九音频信号进行编码,以获取第十音频信号,所述待呈现音频信号包括所述第五音频信号以及所述第十音频信号。
  63. 根据权利要求62所述的音频处理装置,其特征在于,所述渲染模块,用于对所述第七音频信号进行渲染处理,包括:
    所述渲染模块,用于根据渲染元数据对所述第七音频信号进行渲染处理,以获得所述第九音频信号,其中,所述渲染元数据包括第一元数据以及第二元数据,所述第一元数据为所述播放设备端的元数据,所述第二元数据为无线耳机端的元数据。
  64. 根据权利要求63所述的音频处理装置,其特征在于,所述第一元数据包括第一传感子模块元数据,其中,所述第一传感子模块元数据用于表征所述播放设备的运动特征;和/或,
    所述第二元数据包括第二传感子模块元数据以及头相关变换函数HRTF数据库,其中,所述第二传感子模块元数据用于表征所述无线耳机的运动特征。
  65. 根据权利要求64所述的音频处理装置,其特征在于,所述第一传感子模块元数据通过第一传感子模块获得,所述第一传感子模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种;和/或,
    所述第二传感子模块元数据通过第二传感子模块获得,所述第二传感子模块包括陀螺仪传感子模块、头部大小传感子模块、测距传感子模块、地磁传感子模块以及加速度传感子模块中的至少一种。
  66. 根据权利要求57-60中任意一项所述的音频处理装置,其特征在于,所述待呈现音频信号包括基于声道的音频信号、基于对象的音频信号、基于场景的音频信号中的至少一种。
  67. 根据权利要求57-60中任意一项所述的音频处理装置,其特征在于,所述渲染处理包括:双耳虚拟渲染、声道信号渲染、对象信号渲染以及场景信号渲染中的至少一种。
  68. 根据权利要求57-60中任意一项所述的音频处理装置,其特征在于,所述无线传输方式包括:蓝牙通信、红外线通信、WIFI通信、LIFI可见光通信。
  69. 一种音频处理系统,其特征在于,包括:如权利要求35所述的音频处理装置以及如权利要求57所述的音频处理装置。
  70. 一种无线耳机,其特征在于,包括:
    处理器;以及
    存储器,用于存储所述处理器的计算机程序;
    其中,所述处理器被配置为通过执行所述计算机程序来实现权利要求1-22中任意一项所述的音频处理方法。
  71. 一种播放设备,其特征在于,包括:
    处理器;以及
    存储器,用于存储所述处理器的计算机程序;
    其中,所述处理器被配置为通过执行所述计算机程序来实现权利要求23-34中任意一项所述的音频处理方法。
  72. 一种音频处理系统,其特征在于,包括:如权利要求70所述的无线耳机以及如权利要求71所述的播放设备。
  73. 一种计算机可读存储介质,其上存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现权利要求1-22中任意一项所述的音频处理方法。
  74. 一种计算机可读存储介质,其上存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现权利要求23-34中任意一项所述的音频处理方法。
PCT/CN2021/081459 2020-07-31 2021-03-18 音频处理方法、装置、系统以及存储介质 WO2022021898A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21850364.7A EP4171066A4 (en) 2020-07-31 2021-03-18 AUDIO PROCESSING METHOD, APPARATUS AND SYSTEM, AND STORAGE MEDIUM
US18/156,579 US20230156403A1 (en) 2020-07-31 2023-01-19 Audio processing method, apparatus, system, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010762076.3 2020-07-31
CN202010762076.3A CN111918177A (zh) 2020-07-31 2020-07-31 音频处理方法、装置、系统以及存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/156,579 Continuation US20230156403A1 (en) 2020-07-31 2023-01-19 Audio processing method, apparatus, system, and storage medium

Publications (1)

Publication Number Publication Date
WO2022021898A1 true WO2022021898A1 (zh) 2022-02-03

Family

ID=73288203

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/081459 WO2022021898A1 (zh) 2020-07-31 2021-03-18 音频处理方法、装置、系统以及存储介质

Country Status (4)

Country Link
US (1) US20230156403A1 (zh)
EP (1) EP4171066A4 (zh)
CN (1) CN111918177A (zh)
WO (1) WO2022021898A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111918177A (zh) * 2020-07-31 2020-11-10 北京全景声信息科技有限公司 音频处理方法、装置、系统以及存储介质
CN113938652B (zh) * 2021-10-12 2022-07-26 深圳蓝集科技有限公司 一种无线图像传输系统
CN114173256B (zh) * 2021-12-10 2024-04-19 中国电影科学技术研究所 一种还原声场空间及姿态追踪的方法、装置和设备
TWI805215B (zh) * 2022-02-09 2023-06-11 美律實業股份有限公司 真無線耳機系統及耳機同步方法
CN117061935B (zh) * 2023-10-11 2024-04-05 中国民用航空飞行学院 一种无线播音装置

Citations (5)

Publication number Priority date Publication date Assignee Title
US20180091920A1 (en) * 2016-09-23 2018-03-29 Apple Inc. Producing Headphone Driver Signals in a Digital Audio Signal Processing Binaural Rendering Environment
WO2019152783A1 (en) * 2018-02-01 2019-08-08 Qualcomm Incorporated Scalable unified audio renderer
CN110825338A (zh) * 2018-08-07 2020-02-21 大北欧听力公司 音频渲染系统
CN111194561A (zh) * 2017-09-27 2020-05-22 苹果公司 预测性的头部跟踪的双耳音频渲染
CN111918177A (zh) * 2020-07-31 2020-11-10 北京全景声信息科技有限公司 音频处理方法、装置、系统以及存储介质

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
KR102433613B1 (ko) * 2014-12-04 2022-08-19 가우디오랩 주식회사 개인 특징을 반영한 바이노럴 오디오 신호 처리 방법 및 장치
US10598506B2 (en) * 2016-09-12 2020-03-24 Bragi GmbH Audio navigation using short range bilateral earpieces
US11259108B2 (en) * 2018-05-24 2022-02-22 Sony Corporation Information processing device and information processing method
EP3668123B1 (en) * 2018-12-13 2024-07-17 GN Audio A/S Hearing device providing virtual sound
CN111246331A (zh) * 2020-01-10 2020-06-05 北京塞宾科技有限公司 一种无线全景声混音耳机

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
US20180091920A1 (en) * 2016-09-23 2018-03-29 Apple Inc. Producing Headphone Driver Signals in a Digital Audio Signal Processing Binaural Rendering Environment
CN111194561A (zh) * 2017-09-27 2020-05-22 苹果公司 预测性的头部跟踪的双耳音频渲染
WO2019152783A1 (en) * 2018-02-01 2019-08-08 Qualcomm Incorporated Scalable unified audio renderer
CN110825338A (zh) * 2018-08-07 2020-02-21 大北欧听力公司 音频渲染系统
CN111918177A (zh) * 2020-07-31 2020-11-10 北京全景声信息科技有限公司 音频处理方法、装置、系统以及存储介质

Non-Patent Citations (1)

Title
See also references of EP4171066A4 *

Also Published As

Publication number Publication date
US20230156403A1 (en) 2023-05-18
EP4171066A4 (en) 2023-12-27
EP4171066A1 (en) 2023-04-26
CN111918177A (zh) 2020-11-10

Similar Documents

Publication Publication Date Title
WO2022021898A1 (zh) 音频处理方法、装置、系统以及存储介质
CN110651487B (zh) 分布式音频虚拟化系统
JP2019518373A (ja) 没入型オーディオ再生システム
WO2022021899A1 (zh) 音频处理方法、装置、无线耳机以及存储介质
US10129682B2 (en) Method and apparatus to provide a virtualized audio file
CN105353868B (zh) 一种信息处理方法及电子设备
JP2014072894A (ja) カメラによるオーディオ空間化
US20140133658A1 (en) Method and apparatus for providing 3d audio
CN114731483A (zh) 用于虚拟现实音频的声场适配
CN114424587A (zh) 控制音频数据的呈现
WO2021003355A1 (en) Audio capture and rendering for extended reality experiences
US11558707B2 (en) Sound field adjustment
CN114067810A (zh) 音频信号渲染方法和装置
JP7483852B2 (ja) 不一致視聴覚捕捉システム
US11937069B2 (en) Audio system, audio reproduction apparatus, server apparatus, audio reproduction method, and audio reproduction program
WO2021170903A1 (en) Audio representation and associated rendering
US11729570B2 (en) Spatial audio monauralization via data exchange
WO2022262758A1 (zh) 音频渲染系统、方法和电子设备
WO2022262750A1 (zh) 音频渲染系统、方法和电子设备
US20240259731A1 (en) Artificial reverberation in spatial audio
CN111508507B (zh) 一种音频信号处理方法及装置
CN116634348A (zh) 头戴式可穿戴装置、音频信息的处理方法及存储介质
KR20240013110A (ko) 미디어 패킷들을 통한 모션 데이터 전달
CN116195276A (zh) 控制音频数据的渲染

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21850364

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021850364

Country of ref document: EP

Effective date: 20230120

NENP Non-entry into the national phase

Ref country code: DE