WO2023240467A1 - Audio playback method and apparatus, and storage medium - Google Patents

Audio playback method and apparatus, and storage medium Download PDF

Info

Publication number
WO2023240467A1
WO2023240467A1 (PCT/CN2022/098751; CN2022098751W)
Authority
WO
WIPO (PCT)
Prior art keywords
audio
audio data
control information
type
virtual speaker
Prior art date
Application number
PCT/CN2022/098751
Other languages
English (en)
French (fr)
Inventor
吕柱良
史润宇
吕雪洋
刘晗宇
刘念
Original Assignee
北京小米移动软件有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京小米移动软件有限公司 filed Critical 北京小米移动软件有限公司
Priority to PCT/CN2022/098751 priority Critical patent/WO2023240467A1/zh
Priority to CN202280004311.8A priority patent/CN117597945A/zh
Publication of WO2023240467A1 publication Critical patent/WO2023240467A1/zh

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 5/00 Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 7/00 Indicating arrangements; Control arrangements, e.g. balance control

Definitions

  • the present disclosure relates to the field of audio technology, and in particular, to an audio playback method, device and storage medium.
  • the present disclosure provides an audio playback method, device and storage medium.
  • an audio playback method is provided, applied to a terminal, including:
  • the audio data is played according to the sound field control information and the virtual speaker distribution information.
  • an audio playback device applied to a terminal, including:
  • a type determination module configured to determine the audio type corresponding to the audio data played by the terminal
  • An information determination module configured to determine audio control information corresponding to the audio data according to the audio type, where the audio control information includes sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers;
  • a playback module configured to play the audio data according to the sound field control information and the virtual speaker distribution information.
  • an audio playback device including:
  • Memory used to store instructions executable by the processor
  • the processor is configured as:
  • the audio data is played according to the sound field control information and the virtual speaker distribution information.
  • a computer-readable storage medium on which computer program instructions are stored.
  • the program instructions are executed by a processor, the steps of the audio playback method provided by the first aspect of the present disclosure are implemented.
  • the technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects: the audio type corresponding to the audio data played by the terminal is determined; the audio control information corresponding to the audio data is determined according to the audio type, where the audio control information includes sound field control information and virtual speaker distribution information corresponding to multiple virtual speakers; and the audio data is played according to the sound field control information and the virtual speaker distribution information.
  • that is to say, the present disclosure can determine the audio control information corresponding to the audio data according to the audio type corresponding to the audio data, where the audio control information includes sound field control information and virtual speaker distribution information corresponding to multiple virtual speakers. In this way, audio data of different audio types can be played according to different virtual sound fields, so that the spatial perception quality of the played audio data is higher, thereby improving the effect of 3D sound effects.
  • Figure 1 is a flow chart of an audio playback method according to an exemplary embodiment
  • Figure 2 is a virtual speaker distribution diagram according to an exemplary embodiment
  • Figure 3 is another virtual speaker distribution diagram according to an exemplary embodiment
  • Figure 4 is a flow chart of another audio playback method according to an exemplary embodiment
  • Figure 5 is a flow chart of another audio playback method according to an exemplary embodiment
  • Figure 6 is a flow chart of another audio playback method according to an exemplary embodiment
  • Figure 7 is a flow chart of another audio playback method according to an exemplary embodiment
  • Figure 8 is a flow chart of another audio playback method according to an exemplary embodiment
  • Figure 9 is a block diagram of an audio playback device according to an exemplary embodiment
  • Figure 10 is a block diagram of another audio playback device according to an exemplary embodiment
  • Figure 11 is a block diagram of another audio playback device according to an exemplary embodiment
  • Figure 12 is a block diagram of another audio playback device according to an exemplary embodiment
  • Figure 13 is a block diagram of a device according to an exemplary embodiment.
  • "plural" refers to two or more, and other quantifiers are similar; "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single items or plural items.
  • for example, at least one of a, b, or c can mean: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c can each be single or multiple.
  • "and/or" describes an association relationship between related objects and indicates that three kinds of relationships may exist.
  • for example, A and/or B can mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B can be singular or plural.
  • Spatial audio technology refers to a technology that plays back sound through headphones or speakers and enables the listener to perceive the spatial properties of the sound.
  • multiple sets of head-related transfer functions are used to simulate multiple virtual speakers, and each virtual speaker plays back traditional multi-channel sound sources, so that the listener can perceive multiple virtual sound sources and achieve basic 3D sound effects.
  • the position of the virtual sound source generated in this way is fixed, resulting in poor 3D sound effects.
  • the present disclosure provides an audio playback method, device and storage medium, which can determine the audio control information corresponding to the audio data according to the audio type corresponding to the audio data.
  • the audio control information includes sound field control information and a plurality of
  • the virtual speaker distribution information corresponding to the virtual speaker can play different audio types of audio data according to different virtual sound fields, making the spatial perception quality of the played audio data higher, thus improving the effect of 3D sound effects.
  • Figure 1 is a flow chart of an audio playback method according to an exemplary embodiment. As shown in Figure 1, the method is used in a terminal and may include:
  • the audio data may be each frame of data in the target audio played by the terminal, and the target audio may be audio in any format.
  • the audio type may include music, movies, games, voice, etc., which is not limited in this disclosure.
  • the audio type corresponding to the audio data can be determined based on the path information or the audio information corresponding to the audio data, or the audio data can be processed in the time-frequency domain to obtain the audio type corresponding to the audio data.
  • the present disclosure does not limit the specific manner of determining the audio type.
  • the audio control information may include sound field control information and virtual speaker distribution information corresponding to multiple virtual speakers.
  • the sound field control information may include sound effect control strategies, such as low-frequency enhancement, upmixing, downmixing, etc.
  • the virtual speaker distribution information may include the number of virtual speakers, and the position information and direction information of each virtual speaker.
  • for different audio types, different audio control information can be preset through experiments; after the audio type corresponding to the audio data is determined, the audio control information corresponding to that audio type can be determined from the plurality of pieces of preset audio control information.
  • for example, in the case where the audio type is music, Figure 2 is a virtual speaker distribution diagram according to an exemplary embodiment. As shown in Figure 2, it includes 5 virtual speakers, and the dot in the center is the position of the head of the user using the terminal; the angle between virtual speaker 1 and virtual speaker 2 is 30 degrees, the angle between virtual speaker 2 and virtual speaker 3 is 30 degrees, the angle between virtual speaker 3 and virtual speaker 5 is 90 degrees, and the angle between virtual speaker 1 and virtual speaker 4 is 90 degrees.
  • in the case where the audio type is a movie, Figure 3 is another virtual speaker distribution diagram according to an exemplary embodiment. As shown in Figure 3, it also includes 5 virtual speakers, and the dot in the center is the position of the head of the user using the terminal; the angle between virtual speaker 1 and virtual speaker 2 is 60 degrees, the angle between virtual speaker 2 and virtual speaker 3 is 60 degrees, the angle between virtual speaker 3 and virtual speaker 5 is 50 degrees, and the angle between virtual speaker 1 and virtual speaker 4 is 50 degrees.
  • it should be noted that the above virtual speaker distribution diagrams are only examples; for different terminals and users, the virtual speaker distribution diagram may be different, and the present disclosure does not limit this.
  • after the audio control information corresponding to the audio data is determined, audio processing can be performed on the audio data according to the sound field control information, and then, according to the virtual speaker distribution information, binaural rendering can be performed based on existing head-related transfer functions to play the audio data.
  • using the above method, the audio control information corresponding to the audio data can be determined according to the audio type corresponding to the audio data, where the audio control information includes sound field control information and virtual speaker distribution information corresponding to multiple virtual speakers. In this way, audio data of different audio types can be played according to different virtual sound fields, so that the spatial perception quality of the played audio data is higher, thereby improving the effect of 3D sound effects.
  • FIG. 4 is a flow chart of another audio playback method according to an exemplary embodiment. As shown in Figure 4, the implementation of step S101 may include:
  • S1011 Determine the first audio type corresponding to the audio data according to the path information corresponding to the audio data.
  • generally, when the terminal plays audio data, the path information corresponding to the audio data can be determined first.
  • the path information can include a music path, a video path, a notification path, a voice path, and so on, and the audio data is then played according to that path information.
  • since audio data of different audio types use different paths, the first audio type corresponding to the audio data can be determined from the path information used to play the audio data. For example, if the path information corresponding to the audio data is a music path, it may be determined that the first audio type corresponding to the audio data is music.
  • the audio information may include sampling rate, channel information and other meta-information related to the properties of the audio data itself.
  • through a pre-created type association relationship, the second audio type corresponding to the audio data can be determined, where the type association relationship can include correspondences between different audio information and audio types. For example, if the audio information is the sampling rate and the sampling rate corresponding to the audio data is 44.1 kHz, it can be determined that the second audio type is music; if the audio information is the channel information and the channel information corresponding to the audio data is mono, it can be determined that the second audio type corresponding to the audio data is speech. It should be noted that the above-mentioned ways of determining the second audio type are only examples, and the present disclosure does not limit this.
  • the audio data can be analyzed and processed in the time-frequency domain through an existing time-frequency domain analysis algorithm to determine the third audio type corresponding to the audio data, which will not be described again here.
  • after the first audio type, the second audio type, and the third audio type are determined, a first preset weight, a second preset weight, and a third preset weight can be obtained, and the audio type corresponding to the audio data is determined according to the first audio type, the second audio type, the third audio type, the first preset weight, the second preset weight, and the third preset weight.
  • the first preset weight, the second preset weight, and the third preset weight can be preset based on experience; for example, the first preset weight can be 0.5, the second preset weight can be 0.3, and the third preset weight can be 0.2.
  • for example, if the first audio type is music, the second audio type is a movie, and the third audio type is a game, and the first preset weight is 0.5, the second preset weight is 0.3, and the third preset weight is 0.2, it can be determined that the audio type corresponding to the audio data is music; if the first audio type is music, the second audio type is a movie, and the third audio type is a movie, and the first preset weight is 0.4, the second preset weight is 0.3, and the third preset weight is 0.3, it can be determined that the audio type corresponding to the audio data is a movie.
  • the above method determines the first audio type, the second audio type, and the third audio type corresponding to the audio data in different ways, and then determines the audio type corresponding to the audio data by combining the first audio type, the second audio type, and the third audio type, which makes the finally determined audio type more accurate, further improving the effect of 3D sound effects.
  • step S102 can be implemented in the following manner:
  • the audio control information corresponding to the audio type is determined through the preset control information association, and the control information association includes the correspondence between different audio types and audio control information.
  • the present disclosure can predetermine the audio control information corresponding to the audio type through experiments to obtain the control information association relationship.
  • Figure 5 is a flow chart of another audio playback method according to an exemplary embodiment. As shown in Figure 5, the method may further include:
  • the channel information may be mono, two-channel, 5.1-channel, and so on, which is not limited in this disclosure.
  • correspondingly, step S102 may be:
  • determining, according to the audio type and the channel information, the audio control information corresponding to the audio data.
  • the audio control information may include sound field control information and virtual speaker distribution information corresponding to multiple virtual speakers.
  • the sound field control information may include sound effect control strategies, such as low-frequency enhancement, upmixing, downmixing, and so on.
  • the virtual speaker distribution information may include the number of virtual speakers, and the position information and direction information of each virtual speaker.
  • after the audio type corresponding to the audio data is determined, the virtual speaker distribution information corresponding to the audio data can be determined first according to the audio type; then the channel information corresponding to the audio data can be determined, and the sound field control information corresponding to the audio data is determined according to the channel information and the virtual speaker distribution information. For example, if the channel information is two-channel and the number of virtual speakers is 5, it can be determined that the sound field control information is to upmix the two channels into 5 channels; if the channel information is 7.1-channel and the number of virtual speakers is 5, it can be determined that the sound field control information is to downmix the 7.1 channels into 5 channels.
  • FIG. 6 is a flow chart of another audio playback method according to an exemplary embodiment. As shown in Figure 6, the implementation of step S103 may include:
  • for example, if the sound field control information is to upmix two channels into 5 channels, the audio data can be upmixed using existing methods to obtain 5-channel target audio data; if the sound field control information is to downmix 7.1 channels into 5 channels, the audio data can be downmixed using existing methods to obtain 5-channel target audio data.
  • after the target audio data is obtained, the data of the different channels of the target audio data can be transmitted to the corresponding virtual speakers according to the virtual speaker distribution information, binaural rendering is performed based on existing head-related transfer functions, and the target audio data is played through the multiple virtual speakers.
  • in one possible implementation, before the target audio data is played according to the virtual speaker distribution information, the head position information of the user using the terminal can be determined, and the direction information of each virtual speaker can be determined based on the head position information; afterwards, the target audio data is played according to the virtual speaker distribution information and the direction information of each virtual speaker.
  • the head position information of the user using the terminal can be determined by methods in the prior art, which will not be described again here.
  • the head position information can be coordinate information in the world coordinate system. After the head position information is determined, for each virtual speaker, the direction facing the head position information can be used as the direction information of the virtual speaker. Afterwards, binaural rendering can be performed based on the existing head-related transfer function according to the position information and direction information of each virtual speaker, and the target audio data can be played through multiple virtual speakers.
  • Figure 7 is a flow chart of another audio playback method according to an exemplary embodiment. As shown in Figure 7, the method may further include:
  • correspondingly, step S103 may be:
  • playing the audio data according to the sound field control information and the virtual speaker distribution information when the terminal has turned on the sound field adjustment mode.
  • the user can turn on or off the sound field adjustment mode through the sound field adjustment menu in the setting module of the terminal.
  • before the audio data is played according to the sound field control information and the virtual speaker distribution information, whether the sound field adjustment mode of the terminal is turned on can be determined first. If the sound field adjustment mode of the terminal is turned on, the audio data can be played according to the sound field control information and the virtual speaker distribution information; if the sound field adjustment mode of the terminal is not turned on, the sound field of the audio data is not adjusted, and the audio data is played through the speakers installed on the terminal. In this way, users can flexibly choose whether to perform sound field adjustment, which improves the user experience.
  • Figure 8 is a flow chart of another audio playback method according to an exemplary embodiment. As shown in Figure 8, the method may include:
  • the audio data may be each frame of data in the target audio played by the terminal, and the target audio may be audio in any format.
  • the audio type may include music, movies, games, voice, etc., which is not limited in this disclosure.
  • S804 Determine the audio type corresponding to the audio data according to the first audio type, the second audio type, and the third audio type.
  • the first preset weight, the second preset weight and the third preset weight can be obtained.
  • the first audio type, the second audio type, the third audio type, the first preset weight , the second preset weight and the third preset weight determine the audio type corresponding to the audio data.
  • S806 Determine the audio control information corresponding to the audio data according to the audio type and the channel information.
  • the audio control information may include sound field control information and virtual speaker distribution information corresponding to multiple virtual speakers.
  • the sound field control information may include sound effect control strategies, such as low-frequency enhancement, upmixing, downmixing, etc.
  • the virtual speaker distribution information may Including the number of virtual speakers, position information and direction information of each virtual speaker.
  • S807 Perform audio processing on the audio data according to the sound field control information to obtain target audio data.
  • S808 Play the target audio data according to the virtual speaker distribution information.
  • in this step, after the target audio data is obtained, the head position information of the user using the terminal can be determined, the direction information of each virtual speaker can be determined based on the head position information, and then the target audio data can be played according to the virtual speaker distribution information and the direction information of each virtual speaker.
  • audio data of different audio types can be played according to different virtual sound fields, so that the spatial perception quality of the played audio data is higher, thereby improving the effect of 3D sound effects.
  • Figure 9 is a block diagram of an audio playback device according to an exemplary embodiment. As shown in Figure 9, the device may include:
  • the type determination module 901 is configured to determine the audio type corresponding to the audio data played by the terminal;
  • the information determination module 902 is configured to determine audio control information corresponding to the audio data according to the audio type, where the audio control information includes sound field control information and virtual speaker distribution information corresponding to multiple virtual speakers;
  • the play module 903 is configured to play the audio data according to the sound field control information and the virtual speaker distribution information.
  • the type determination module 901 is also configured to:
  • the audio type corresponding to the audio data is determined according to the first audio type, the second audio type and the third audio type.
  • the type determination module 901 is also configured to:
  • the audio type corresponding to the audio data is determined according to the first audio type, the second audio type, the third audio type, the first preset weight, the second preset weight and the third preset weight.
  • the information determination module 902 is also configured to:
  • the audio control information corresponding to the audio type is determined through the preset control information association.
  • the control information association includes the correspondence between different audio types and audio control information.
  • Figure 10 is a block diagram of another audio playback device according to an exemplary embodiment. As shown in Figure 10, the device further includes:
  • the channel determination module 904 is configured to determine the channel information corresponding to the audio data;
  • the information determination module 902 is also configured to:
  • determine the audio control information corresponding to the audio data according to the audio type and the channel information.
  • the playback module 903 is also configured to:
  • perform audio processing on the audio data according to the sound field control information to obtain target audio data;
  • play the target audio data according to the virtual speaker distribution information.
  • Figure 11 is a block diagram of another audio playback device according to an exemplary embodiment. As shown in Figure 11, the device further includes:
  • the position determination module 905 is configured to determine the head position information of the user using the terminal;
  • the direction determination module 906 is configured to determine the direction information of each virtual speaker based on the head position information
  • the playback module 903 is also configured as:
  • the target audio data is played according to the virtual speaker distribution information and the direction information of each virtual speaker.
  • Figure 12 is a block diagram of another audio playback device according to an exemplary embodiment. As shown in Figure 12, the device further includes:
  • the mode determination module 907 is configured to determine whether the terminal turns on the sound field adjustment mode
  • the playback module 903 is also configured to:
  • play the audio data according to the sound field control information and the virtual speaker distribution information when the terminal has turned on the sound field adjustment mode.
  • through the above device, the audio control information corresponding to the audio data can be determined according to the audio type corresponding to the audio data, where the audio control information includes sound field control information and virtual speaker distribution information corresponding to multiple virtual speakers. In this way, audio data of different audio types can be played according to different virtual sound fields, so that the spatial perception quality of the played audio data is higher, thereby improving the effect of 3D sound effects.
  • the present disclosure also provides a computer-readable storage medium on which computer program instructions are stored. When the program instructions are executed by a processor, the steps of the audio playback method provided by the present disclosure are implemented.
  • FIG. 13 is a block diagram of a device 800 according to an exemplary embodiment.
  • the device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
  • apparatus 800 may include one or more of the following components: processing component 802, memory 804, power supply component 806, multimedia component 808, audio component 810, input/output interface 812, sensor component 814, and communication component 816.
  • Processing component 802 generally controls the overall operations of device 800, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
  • the processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the above audio playback method.
  • processing component 802 may include one or more modules that facilitate interaction between processing component 802 and other components.
  • processing component 802 may include a multimedia module to facilitate interaction between multimedia component 808 and processing component 802.
  • Memory 804 is configured to store various types of data to support operations at device 800 . Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, etc.
  • Memory 804 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
  • Power supply component 806 provides power to the various components of device 800.
  • Power supply components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to device 800 .
  • Multimedia component 808 includes a screen that provides an output interface between the device 800 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide action.
  • multimedia component 808 includes a front-facing camera and/or a rear-facing camera.
  • the front camera and/or the rear camera may receive external multimedia data.
  • Each front-facing camera and rear-facing camera can be a fixed optical lens system or have a focal length and optical zoom capabilities.
  • Audio component 810 is configured to output and/or input audio signals.
  • audio component 810 includes a microphone (MIC) configured to receive external audio signals when device 800 is in operating modes, such as call mode, recording mode, and speech recognition mode. The received audio signal may be further stored in memory 804 or sent via communication component 816 .
  • audio component 810 also includes a speaker for outputting audio signals.
  • the input/output interface 812 provides an interface between the processing component 802 and a peripheral interface module, which may be a keyboard, a click wheel, a button, etc. These buttons may include, but are not limited to: Home button, Volume buttons, Start button, and Lock button.
  • Sensor component 814 includes one or more sensors that provide various aspects of status assessment for device 800 .
  • for example, the sensor component 814 can detect the open/closed state of the device 800 and the relative positioning of components, such as the display and keypad of the device 800; the sensor component 814 can also detect a change in the position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and temperature changes of the device 800.
  • Sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 816 is configured to facilitate wired or wireless communication between apparatus 800 and other devices.
  • Device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
  • the communication component 816 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communications component 816 also includes a near field communications (NFC) module to facilitate short-range communications.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • in an exemplary embodiment, apparatus 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for executing the above audio playback method.
  • a non-transitory computer-readable storage medium including instructions such as a memory 804 including instructions, is also provided, and the instructions can be executed by the processor 820 of the device 800 to complete the above audio playback method.
  • the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
  • in another exemplary embodiment, a computer program product is also provided, the computer program product comprising a computer program executable by a programmable device, the computer program having code portions for performing the above audio playback method when executed by the programmable device.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

The present disclosure relates to an audio playback method and apparatus, and a storage medium. The method is applied to a terminal and includes: determining an audio type corresponding to audio data played by the terminal; determining, according to the audio type, audio control information corresponding to the audio data, the audio control information including sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers; and playing the audio data according to the sound field control information and the virtual speaker distribution information. According to the present disclosure, the audio control information corresponding to the audio data can be determined according to the audio type corresponding to the audio data, the audio control information including sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers. In this way, audio data of different audio types can be played according to different virtual sound fields, so that the spatial perception quality of the played audio data is higher, thereby improving the effect of 3D sound effects.

Description

Audio playback method and apparatus, and storage medium
Technical Field
The present disclosure relates to the field of audio technology, and in particular, to an audio playback method and apparatus, and a storage medium.
Background
In recent years, headphones have gradually become the preferred way for the general public to play audio. At the same time, with the development of audio technology, people are no longer satisfied with audio played back through headphones merely being audible and clear; higher requirements have also been placed on the spatial sense of the sound.
In the related art, multiple sets of head-related transfer functions are used to simulate multiple virtual speakers, and each virtual speaker plays back a traditional multi-channel sound source, so that a listener can experience a rich three-dimensional sound field through ordinary headphones, thereby achieving basic 3D sound effects. However, the 3D sound effects provided in this way are relatively poor, so how to improve the effect of 3D sound effects has become an urgent problem to be solved.
Summary
To overcome the problems in the related art, the present disclosure provides an audio playback method and apparatus, and a storage medium.
According to a first aspect of the embodiments of the present disclosure, an audio playback method is provided, applied to a terminal, including:
determining an audio type corresponding to audio data played by the terminal;
determining, according to the audio type, audio control information corresponding to the audio data, the audio control information including sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers; and
playing the audio data according to the sound field control information and the virtual speaker distribution information.
According to a second aspect of the embodiments of the present disclosure, an audio playback apparatus is provided, applied to a terminal, including:
a type determination module configured to determine an audio type corresponding to audio data played by the terminal;
an information determination module configured to determine, according to the audio type, audio control information corresponding to the audio data, the audio control information including sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers; and
a playback module configured to play the audio data according to the sound field control information and the virtual speaker distribution information.
According to a third aspect of the embodiments of the present disclosure, an audio playback apparatus is provided, including:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
determine an audio type corresponding to audio data played by a terminal;
determine, according to the audio type, audio control information corresponding to the audio data, the audio control information including sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers; and
play the audio data according to the sound field control information and the virtual speaker distribution information.
According to a fourth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, on which computer program instructions are stored, where the program instructions, when executed by a processor, implement the steps of the audio playback method provided by the first aspect of the present disclosure.
The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects: the audio type corresponding to the audio data played by the terminal is determined; the audio control information corresponding to the audio data is determined according to the audio type, the audio control information including sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers; and the audio data is played according to the sound field control information and the virtual speaker distribution information. That is to say, according to the present disclosure, the audio control information corresponding to the audio data can be determined according to the audio type corresponding to the audio data, the audio control information including sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers. In this way, audio data of different audio types can be played according to different virtual sound fields, so that the spatial perception quality of the played audio data is higher, thereby improving the effect of 3D sound effects.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings here are incorporated into the specification and constitute a part of the specification; they illustrate embodiments consistent with the present disclosure and are used together with the specification to explain the principles of the present disclosure.
Figure 1 is a flowchart of an audio playback method according to an exemplary embodiment;
Figure 2 is a virtual speaker distribution diagram according to an exemplary embodiment;
Figure 3 is another virtual speaker distribution diagram according to an exemplary embodiment;
Figure 4 is a flowchart of another audio playback method according to an exemplary embodiment;
Figure 5 is a flowchart of another audio playback method according to an exemplary embodiment;
Figure 6 is a flowchart of another audio playback method according to an exemplary embodiment;
Figure 7 is a flowchart of another audio playback method according to an exemplary embodiment;
Figure 8 is a flowchart of another audio playback method according to an exemplary embodiment;
Figure 9 is a block diagram of an audio playback apparatus according to an exemplary embodiment;
Figure 10 is a block diagram of another audio playback apparatus according to an exemplary embodiment;
Figure 11 is a block diagram of another audio playback apparatus according to an exemplary embodiment;
Figure 12 is a block diagram of another audio playback apparatus according to an exemplary embodiment;
Figure 13 is a block diagram of a device according to an exemplary embodiment.
Detailed Description
Exemplary embodiments will be described in detail here, examples of which are shown in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
It should be noted that all actions of obtaining signals, information or data in this application are performed under the premise of complying with the corresponding data protection laws and policies of the country where they are performed, and with the authorization given by the owner of the corresponding device.
In the present disclosure, terms such as "first" and "second" are used to distinguish similar objects and are not necessarily to be understood as indicating a specific order or sequence. In addition, unless otherwise stated, in the description with reference to the drawings, the same reference numeral in different drawings represents the same element.
In the description of the present disclosure, unless otherwise stated, "plural" means two or more than two, and other quantifiers are similar; "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single items or plural items. For example, at least one of a, b, or c can mean: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c can each be single or multiple. "And/or" describes an association relationship between related objects and indicates that three kinds of relationships may exist; for example, A and/or B can mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B can be singular or plural.
In the embodiments of the present disclosure, although the operations are described in the drawings in a specific order, this should not be understood as requiring that these operations be performed in the specific order shown or in a serial order, or that all of the operations shown be performed, in order to obtain the desired result. In certain circumstances, multitasking and parallel processing may be advantageous.
Before the specific implementations of the present disclosure are described in detail, the application scenario of the present disclosure is first explained. Spatial audio technology refers to technology that plays back sound through headphones or speakers and enables the listener to perceive the spatial attributes of the sound. In the related art, multiple sets of head-related transfer functions are used to simulate multiple virtual speakers, and each virtual speaker plays back a traditional multi-channel sound source, so that the listener can perceive multiple virtual sound sources, achieving basic 3D sound effects. However, the positions of the virtual sound sources generated in this way are fixed, resulting in relatively poor 3D sound effects.
To solve the above problems, the present disclosure provides an audio playback method and apparatus, and a storage medium, which can determine, according to the audio type corresponding to audio data, the audio control information corresponding to the audio data, the audio control information including sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers. In this way, audio data of different audio types can be played according to different virtual sound fields, so that the spatial perception quality of the played audio data is higher, thereby improving the effect of 3D sound effects.
The present disclosure is described below with reference to specific embodiments.
Figure 1 is a flowchart of an audio playback method according to an exemplary embodiment. As shown in Figure 1, the method is used in a terminal and may include:
S101. Determine an audio type corresponding to audio data played by the terminal.
The audio data may be each frame of data in target audio played by the terminal, and the target audio may be audio in any format. The audio type may include music, movies, games, voice, and so on, which is not limited in the present disclosure.
In this step, the audio type corresponding to the audio data can be determined according to the path information or the audio information corresponding to the audio data, or the audio data can be processed in the time-frequency domain to obtain the audio type corresponding to the audio data; the present disclosure does not limit the specific manner of determining the audio type.
S102. Determine, according to the audio type, audio control information corresponding to the audio data.
The audio control information may include sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers. The sound field control information may include sound effect control strategies, such as low-frequency enhancement, upmixing, downmixing, and so on. The virtual speaker distribution information may include the number of virtual speakers, and the position information and direction information of each virtual speaker.
In this step, for different audio types, different audio control information can be preset through experiments; after the audio type corresponding to the audio data is determined, the audio control information corresponding to the audio type can be determined from the plurality of pieces of preset audio control information.
For example, in the case where the audio type is music, Figure 2 is a virtual speaker distribution diagram according to an exemplary embodiment. As shown in Figure 2, it includes 5 virtual speakers, and the dot in the center is the position of the head of the user using the terminal; the angle between virtual speaker 1 and virtual speaker 2 is 30 degrees, the angle between virtual speaker 2 and virtual speaker 3 is 30 degrees, the angle between virtual speaker 3 and virtual speaker 5 is 90 degrees, and the angle between virtual speaker 1 and virtual speaker 4 is 90 degrees. In the case where the audio type is a movie, Figure 3 is another virtual speaker distribution diagram according to an exemplary embodiment. As shown in Figure 3, it also includes 5 virtual speakers, and the dot in the center is the position of the head of the user using the terminal; the angle between virtual speaker 1 and virtual speaker 2 is 60 degrees, the angle between virtual speaker 2 and virtual speaker 3 is 60 degrees, the angle between virtual speaker 3 and virtual speaker 5 is 50 degrees, and the angle between virtual speaker 1 and virtual speaker 4 is 50 degrees.
It should be noted that the above virtual speaker distribution diagrams are only examples; for different terminals and users, the virtual speaker distribution diagram may be different, and the present disclosure does not limit this.
S103. Play the audio data according to the sound field control information and the virtual speaker distribution information.
In this step, after the audio control information corresponding to the audio data is determined, audio processing can be performed on the audio data according to the sound field control information, and then, according to the virtual speaker distribution information, binaural rendering can be performed based on head-related transfer functions in the prior art to play the audio data.
Using the above method, the audio control information corresponding to the audio data can be determined according to the audio type corresponding to the audio data, the audio control information including sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers. In this way, audio data of different audio types can be played according to different virtual sound fields, so that the spatial perception quality of the played audio data is higher, thereby improving the effect of 3D sound effects.
Figure 4 is a flowchart of another audio playback method according to an exemplary embodiment. As shown in Figure 4, the implementation of step S101 may include:
S1011. Determine, according to path information corresponding to the audio data, a first audio type corresponding to the audio data.
Generally, when the terminal plays audio data, the path information corresponding to the audio data can be determined first. For example, the path information may include a music path, a video path, a notification path, a voice path, and so on; afterwards, the audio data can be played according to the path information. Since audio data of different audio types use different paths, the first audio type corresponding to the audio data can be determined from the path information used to play the audio data. For example, if the path information corresponding to the audio data is a music path, it can be determined that the first audio type corresponding to the audio data is music.
S1012. Determine, according to audio information corresponding to the audio data, a second audio type corresponding to the audio data.
For example, the audio information may include the sampling rate, channel information, and other meta-information related to the attributes of the audio data itself. Through a pre-created type association relationship, the second audio type corresponding to the audio data can be determined, where the type association relationship may include correspondences between different audio information and audio types. For example, if the audio information is the sampling rate and the sampling rate corresponding to the audio data is 44.1 kHz, it can be determined that the second audio type is music; if the audio information is the channel information and the channel information corresponding to the audio data is mono, it can be determined that the second audio type corresponding to the audio data is speech. It should be noted that the above manners of determining the second audio type are only examples, and the present disclosure does not limit this.
S1013. Perform time-frequency domain processing on the audio data to obtain a third audio type corresponding to the audio data.
For example, the audio data can be analyzed and processed in the time-frequency domain through an existing time-frequency domain analysis algorithm to determine the third audio type corresponding to the audio data, which will not be described again here.
S1014. Determine, according to the first audio type, the second audio type, and the third audio type, the audio type corresponding to the audio data.
In one possible implementation, after the first audio type, the second audio type, and the third audio type are determined, a first preset weight, a second preset weight, and a third preset weight can be obtained, and the audio type corresponding to the audio data is determined according to the first audio type, the second audio type, the third audio type, the first preset weight, the second preset weight, and the third preset weight. The first preset weight, the second preset weight, and the third preset weight can be preset based on experience; for example, the first preset weight may be 0.5, the second preset weight may be 0.3, and the third preset weight may be 0.2.
For example, if the first audio type is music, the second audio type is a movie, and the third audio type is a game, and the first preset weight is 0.5, the second preset weight is 0.3, and the third preset weight is 0.2, it can be determined that the audio type corresponding to the audio data is music. If the first audio type is music, the second audio type is a movie, and the third audio type is a movie, and the first preset weight is 0.4, the second preset weight is 0.3, and the third preset weight is 0.3, it can be determined that the audio type corresponding to the audio data is a movie.
In the above method, the first audio type, the second audio type, and the third audio type corresponding to the audio data are determined in different ways, and the audio type corresponding to the audio data is then determined by combining the first audio type, the second audio type, and the third audio type, so that the finally determined audio type is more accurate, further improving the effect of 3D sound effects.
In one possible implementation, step S102 can be implemented in the following manner:
determining the audio control information corresponding to the audio type through a preset control information association relationship, where the control information association relationship includes correspondences between different audio types and audio control information.
For example, for each audio type, the present disclosure can determine in advance, through experiments, the audio control information corresponding to the audio type, so as to obtain the control information association relationship.
Figure 5 is a flowchart of another audio playback method according to an exemplary embodiment. As shown in Figure 5, the method may further include:
S104. Determine channel information corresponding to the audio data.
The channel information may be mono, two-channel, 5.1-channel, and so on, which is not limited in the present disclosure.
Correspondingly, step S102 may be:
determining, according to the audio type and the channel information, the audio control information corresponding to the audio data.
The audio control information may include sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers. The sound field control information may include sound effect control strategies, such as low-frequency enhancement, upmixing, downmixing, and so on. The virtual speaker distribution information may include the number of virtual speakers, and the position information and direction information of each virtual speaker.
After the audio type corresponding to the audio data is determined, the virtual speaker distribution information corresponding to the audio data can be determined first according to the audio type; then, the channel information corresponding to the audio data can be determined, and the sound field control information corresponding to the audio data is determined according to the channel information and the virtual speaker distribution information. For example, if the channel information is two-channel and the number of virtual speakers is 5, it can be determined that the sound field control information is to upmix the two channels into 5 channels; if the channel information is 7.1-channel and the number of virtual speakers is 5, it can be determined that the sound field control information is to downmix the 7.1 channels into 5 channels.
Figure 6 is a flowchart of another audio playback method according to an exemplary embodiment. As shown in Figure 6, the implementation of step S103 may include:
S1031. Perform audio processing on the audio data according to the sound field control information to obtain target audio data.
For example, if the sound field control information is to upmix two channels into 5 channels, the audio data can be upmixed using methods in the prior art to obtain 5-channel target audio data; if the sound field control information is to downmix 7.1 channels into 5 channels, the audio data can be downmixed using methods in the prior art to obtain 5-channel target audio data.
S1032. Play the target audio data according to the virtual speaker distribution information.
After the target audio data is obtained, the data of the different channels of the target audio data can be transmitted to the corresponding virtual speakers according to the virtual speaker distribution information, binaural rendering is performed based on existing head-related transfer functions, and the target audio data is played through the multiple virtual speakers.
In one possible implementation, before the target audio data is played according to the virtual speaker distribution information, head position information of the user using the terminal can be determined, and the direction information of each virtual speaker can be determined according to the head position information. Afterwards, the target audio data is played according to the virtual speaker distribution information and the direction information of each virtual speaker.
For example, the head position information of the user using the terminal can be determined by methods in the prior art, which will not be described again here; the head position information may be coordinate information in the world coordinate system. After the head position information is determined, for each virtual speaker, the direction facing the head position information can be taken as the direction information of that virtual speaker. Afterwards, binaural rendering can be performed based on existing head-related transfer functions according to the position information and direction information of each virtual speaker, and the target audio data can be played through the multiple virtual speakers.
Figure 7 is a flowchart of another audio playback method according to an exemplary embodiment. As shown in Figure 7, the method may further include:
S105. Determine whether the terminal has turned on a sound field adjustment mode.
Correspondingly, step S103 may be:
playing the audio data according to the sound field control information and the virtual speaker distribution information when the terminal has turned on the sound field adjustment mode.
For example, the user can turn the sound field adjustment mode on or off through a sound field adjustment menu in a settings module of the terminal. Before the audio data is played according to the sound field control information and the virtual speaker distribution information, whether the sound field adjustment mode of the terminal is turned on can be determined first. If the sound field adjustment mode of the terminal is turned on, the audio data can be played according to the sound field control information and the virtual speaker distribution information; if the sound field adjustment mode of the terminal is not turned on, the sound field of the audio data is not adjusted, and the audio data is played through the speakers installed on the terminal. In this way, the user can flexibly choose whether to perform sound field adjustment, which improves the user experience.
Figure 8 is a flowchart of another audio playback method according to an exemplary embodiment. As shown in Figure 8, the method may include:
S801. Determine, according to path information corresponding to audio data played by a terminal, a first audio type corresponding to the audio data.
The audio data may be each frame of data in target audio played by the terminal, and the target audio may be audio in any format. The audio type may include music, movies, games, voice, and so on, which is not limited in the present disclosure.
S802. Determine, according to audio information corresponding to the audio data, a second audio type corresponding to the audio data.
S803. Perform time-frequency domain processing on the audio data to obtain a third audio type corresponding to the audio data.
S804. Determine, according to the first audio type, the second audio type, and the third audio type, the audio type corresponding to the audio data.
In this step, a first preset weight, a second preset weight, and a third preset weight can be obtained, and the audio type corresponding to the audio data is determined according to the first audio type, the second audio type, the third audio type, the first preset weight, the second preset weight, and the third preset weight.
S805. Determine channel information corresponding to the audio data.
S806. Determine, according to the audio type and the channel information, audio control information corresponding to the audio data.
The audio control information may include sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers. The sound field control information may include sound effect control strategies, such as low-frequency enhancement, upmixing, downmixing, and so on. The virtual speaker distribution information may include the number of virtual speakers, and the position information and direction information of each virtual speaker.
S807. Perform audio processing on the audio data according to the sound field control information to obtain target audio data.
S808. Play the target audio data according to the virtual speaker distribution information.
In this step, after the target audio data is obtained, the head position information of the user using the terminal can be determined, the direction information of each virtual speaker can be determined according to the head position information, and then the target audio data can be played according to the virtual speaker distribution information and the direction information of each virtual speaker.
Using the above method, audio data of different audio types can be played according to different virtual sound fields, so that the spatial perception quality of the played audio data is higher, thereby improving the effect of 3D sound effects.
Figure 9 is a block diagram of an audio playback apparatus according to an exemplary embodiment. As shown in Figure 9, the apparatus may include:
a type determination module 901 configured to determine an audio type corresponding to audio data played by the terminal;
an information determination module 902 configured to determine, according to the audio type, audio control information corresponding to the audio data, the audio control information including sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers; and
a playback module 903 configured to play the audio data according to the sound field control information and the virtual speaker distribution information.
Optionally, the type determination module 901 is further configured to:
determine, according to path information corresponding to the audio data, a first audio type corresponding to the audio data;
determine, according to audio information corresponding to the audio data, a second audio type corresponding to the audio data;
perform time-frequency domain processing on the audio data to obtain a third audio type corresponding to the audio data; and
determine, according to the first audio type, the second audio type, and the third audio type, the audio type corresponding to the audio data.
Optionally, the type determination module 901 is further configured to:
obtain a first preset weight, a second preset weight, and a third preset weight; and
determine, according to the first audio type, the second audio type, the third audio type, the first preset weight, the second preset weight, and the third preset weight, the audio type corresponding to the audio data.
Optionally, the information determination module 902 is further configured to:
determine the audio control information corresponding to the audio type through a preset control information association relationship, where the control information association relationship includes correspondences between different audio types and audio control information.
Optionally, Figure 10 is a block diagram of another audio playback apparatus according to an exemplary embodiment. As shown in Figure 10, the apparatus further includes:
a channel determination module 904 configured to determine channel information corresponding to the audio data;
the information determination module 902 is further configured to:
determine, according to the audio type and the channel information, the audio control information corresponding to the audio data.
Optionally, the playback module 903 is further configured to:
perform audio processing on the audio data according to the sound field control information to obtain target audio data; and
play the target audio data according to the virtual speaker distribution information.
Optionally, Figure 11 is a block diagram of another audio playback apparatus according to an exemplary embodiment. As shown in Figure 11, the apparatus further includes:
a position determination module 905 configured to determine head position information of a user using the terminal; and
a direction determination module 906 configured to determine direction information of each virtual speaker according to the head position information;
the playback module 903 is further configured to:
play the target audio data according to the virtual speaker distribution information and the direction information of each virtual speaker.
Optionally, Figure 12 is a block diagram of another audio playback apparatus according to an exemplary embodiment. As shown in Figure 12, the apparatus further includes:
a mode determination module 907 configured to determine whether the terminal has turned on a sound field adjustment mode;
the playback module 903 is further configured to:
play the audio data according to the sound field control information and the virtual speaker distribution information when the terminal has turned on the sound field adjustment mode.
Through the above apparatus, the audio control information corresponding to the audio data can be determined according to the audio type corresponding to the audio data, the audio control information including sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers. In this way, audio data of different audio types can be played according to different virtual sound fields, so that the spatial perception quality of the played audio data is higher, thereby improving the effect of 3D sound effects.
With regard to the apparatuses in the above embodiments, the specific manners in which the respective modules perform operations have been described in detail in the embodiments relating to the method, and will not be elaborated here.
The present disclosure also provides a computer-readable storage medium on which computer program instructions are stored, where the program instructions, when executed by a processor, implement the steps of the audio playback method provided by the present disclosure.
Figure 13 is a block diagram of a device 800 according to an exemplary embodiment. For example, the device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
Referring to Figure 13, the device 800 may include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input/output interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls the overall operation of the device 800, such as operations associated with display, telephone calls, data communication, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions, so as to complete all or part of the steps of the above audio playback method. In addition, the processing component 802 may include one or more modules to facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phonebook data, messages, pictures, videos, and so on. The memory 804 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The power supply component 806 provides power to the various components of the device 800. The power supply component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC), and the microphone is configured to receive external audio signals when the device 800 is in an operating mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signal may be further stored in the memory 804 or sent via the communication component 816. In some embodiments, the audio component 810 also includes a speaker for outputting audio signals.
The input/output interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the device 800. For example, the sensor component 814 can detect the open/closed state of the device 800 and the relative positioning of components (for example, the display and keypad of the device 800); the sensor component 814 can also detect a change in the position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and temperature changes of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above audio playback method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions, such as the memory 804 including instructions, is also provided, and the above instructions can be executed by the processor 820 of the device 800 to complete the above audio playback method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In another exemplary embodiment, a computer program product is also provided, the computer program product including a computer program executable by a programmable device, the computer program having code portions for performing the above audio playback method when executed by the programmable device.
Those skilled in the art will easily conceive of other embodiments of the present disclosure after considering the specification and practicing the present disclosure. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow the general principles of the present disclosure and include common general knowledge or customary technical means in the technical field not disclosed in the present disclosure. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of the present disclosure are indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures that have been described above and shown in the accompanying drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (11)

  1. An audio playback method, applied to a terminal, characterized in that the method comprises:
    determining an audio type corresponding to audio data played by the terminal;
    determining, according to the audio type, audio control information corresponding to the audio data, the audio control information comprising sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers; and
    playing the audio data according to the sound field control information and the virtual speaker distribution information.
  2. The method according to claim 1, characterized in that the determining the audio type corresponding to the audio data played by the terminal comprises:
    determining, according to path information corresponding to the audio data, a first audio type corresponding to the audio data;
    determining, according to audio information corresponding to the audio data, a second audio type corresponding to the audio data;
    performing time-frequency domain processing on the audio data to obtain a third audio type corresponding to the audio data; and
    determining, according to the first audio type, the second audio type, and the third audio type, the audio type corresponding to the audio data.
  3. The method according to claim 2, characterized in that the determining, according to the first audio type, the second audio type, and the third audio type, the audio type corresponding to the audio data comprises:
    obtaining a first preset weight, a second preset weight, and a third preset weight; and
    determining, according to the first audio type, the second audio type, the third audio type, the first preset weight, the second preset weight, and the third preset weight, the audio type corresponding to the audio data.
  4. The method according to claim 1, characterized in that the determining, according to the audio type, the audio control information corresponding to the audio data comprises:
    determining the audio control information corresponding to the audio type through a preset control information association relationship, wherein the control information association relationship comprises correspondences between different audio types and audio control information.
  5. The method according to claim 1, characterized in that before the determining, according to the audio type, the audio control information corresponding to the audio data, the method further comprises:
    determining channel information corresponding to the audio data;
    wherein the determining, according to the audio type, the audio control information corresponding to the audio data comprises:
    determining, according to the audio type and the channel information, the audio control information corresponding to the audio data.
  6. The method according to claim 5, characterized in that the playing the audio data according to the sound field control information and the virtual speaker distribution information comprises:
    performing audio processing on the audio data according to the sound field control information to obtain target audio data; and
    playing the target audio data according to the virtual speaker distribution information.
  7. The method according to claim 6, characterized in that before the playing the target audio data according to the virtual speaker distribution information, the method further comprises:
    determining head position information of a user using the terminal; and
    determining direction information of each of the virtual speakers according to the head position information;
    wherein the playing the target audio data according to the virtual speaker distribution information comprises:
    playing the target audio data according to the virtual speaker distribution information and the direction information of each of the virtual speakers.
  8. The method according to any one of claims 1 to 7, characterized in that before the playing the audio data according to the sound field control information and the virtual speaker distribution information, the method further comprises:
    determining whether the terminal has turned on a sound field adjustment mode;
    wherein the playing the audio data according to the sound field control information and the virtual speaker distribution information comprises:
    playing the audio data according to the sound field control information and the virtual speaker distribution information when the terminal has turned on the sound field adjustment mode.
  9. An audio playback apparatus, applied to a terminal, characterized in that the apparatus comprises:
    a type determination module configured to determine an audio type corresponding to audio data played by the terminal;
    an information determination module configured to determine, according to the audio type, audio control information corresponding to the audio data, the audio control information comprising sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers; and
    a playback module configured to play the audio data according to the sound field control information and the virtual speaker distribution information.
  10. An audio playback apparatus, characterized in that the apparatus comprises:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to:
    determine an audio type corresponding to audio data played by a terminal;
    determine, according to the audio type, audio control information corresponding to the audio data, the audio control information comprising sound field control information and virtual speaker distribution information corresponding to a plurality of virtual speakers; and
    play the audio data according to the sound field control information and the virtual speaker distribution information.
  11. A computer-readable storage medium having computer program instructions stored thereon, characterized in that the program instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 8.
PCT/CN2022/098751 2022-06-14 2022-06-14 Audio playback method and apparatus, and storage medium WO2023240467A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2022/098751 WO2023240467A1 (zh) 2022-06-14 2022-06-14 Audio playback method and apparatus, and storage medium
CN202280004311.8A CN117597945A (zh) 2022-06-14 2022-06-14 Audio playback method and apparatus, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/098751 WO2023240467A1 (zh) 2022-06-14 2022-06-14 Audio playback method and apparatus, and storage medium

Publications (1)

Publication Number Publication Date
WO2023240467A1 true WO2023240467A1 (zh) 2023-12-21

Family

ID=89192961

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/098751 WO2023240467A1 (zh) 2022-06-14 2022-06-14 Audio playback method and apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN117597945A (zh)
WO (1) WO2023240467A1 (zh)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107493542A (zh) * 2012-08-31 2017-12-19 杜比实验室特许公司 用于在听音环境中播放音频内容的扬声器系统
CN105120421A (zh) * 2015-08-21 2015-12-02 北京时代拓灵科技有限公司 一种生成虚拟环绕声的方法和装置
US20180332424A1 (en) * 2017-05-12 2018-11-15 Microsoft Technology Licensing, Llc Spatializing audio data based on analysis of incoming audio data
CN113038343A (zh) * 2019-12-09 2021-06-25 三星电子株式会社 音频输出装置及其控制方法

Also Published As

Publication number Publication date
CN117597945A (zh) 2024-02-23

Similar Documents

Publication Publication Date Title
CN108141696B (zh) 用于空间音频调节的系统和方法
KR102538775B1 (ko) 오디오 재생 방법 및 오디오 재생 장치, 전자 기기 및 저장 매체
KR101405646B1 (ko) 휴대용 통신 디바이스 및 지향된 사운드 출력을 이용한 통신 가능화
US9374647B2 (en) Method and apparatus using head movement for user interface
CN106454644B (zh) 音频播放方法及装置
JP5453297B2 (ja) オーディオミクスチャ内での音源に関する別個の知覚位置を提供する方法および装置
JP6186518B2 (ja) 音声通話プロンプト方法、装置、プログラム及び記録媒体
US20230195412A1 (en) Augmenting control sound with spatial audio cues
TWI648994B (zh) 一種獲得空間音訊定向向量的方法、裝置及設備
WO2017016109A1 (zh) 事件提醒方法及装置
US11736862B1 (en) Audio system and method of augmenting spatial audio rendition
US20110010627A1 (en) Spatial user interface for audio system
US20120317594A1 (en) Method and system for providing an improved audio experience for viewers of video
EP4007999A1 (en) Masa with embedded near-far stereo for mobile devices
CN111512648A (zh) 启用空间音频内容的渲染以用于由用户消费
WO2018058331A1 (zh) 控制音量的方法及装置
WO2023240467A1 (zh) 音频播放方法、装置及存储介质
CN114339582B (zh) 双通道音频处理、方向感滤波器生成方法、装置以及介质
WO2022142254A1 (zh) 歌曲录制方法及存储介质
WO2023212883A1 (zh) 音频输出方法和装置、通信装置和存储介质
US11570565B2 (en) Apparatus, method, computer program for enabling access to mediated reality content by a remote user
EP4152770A1 (en) A method and apparatus for communication audio handling in immersive audio scene rendering
WO2023245390A1 (zh) 智能耳机的控制方法、装置、电子设备和存储介质
WO2024119946A1 (zh) 音频控制方法、音频控制装置、介质与电子设备
CN117319889A (zh) 音频信号的处理方法、装置、电子设备、及存储介质

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 202280004311.8

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22946166

Country of ref document: EP

Kind code of ref document: A1