WO2019062541A1 - Method and apparatus for real-time digital audio signal mixing - Google Patents

Method and apparatus for real-time digital audio signal mixing

Info

Publication number
WO2019062541A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
frame
data
real
audio data
Prior art date
Application number
PCT/CN2018/105037
Other languages
English (en)
French (fr)
Inventor
张硕
刘炜刚
韩晓征
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2019062541A1


Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K15/00Acoustics not otherwise provided for
    • G10K15/08Arrangements for producing a reverberation or echo sound
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10Digital recording or reproducing
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/24Signal processing not specific to the method of recording or reproducing; Circuits therefor for reducing noise
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip

Definitions

  • the present application relates to the field of communications, and in particular, to a method and apparatus for audio signal processing.
  • Embodiments of the present application provide a method and apparatus for audio signal processing to implement flexible real-time digital mixing.
  • the first aspect of the present application provides a method for processing audio signals, including: acquiring at least one frame of audio data through a media play interface, where the media play interface is an application programming interface; transmitting the at least one frame of audio data to a digital mixing module through a data path in the system software; acquiring a real-time audio signal; and mixing the at least one frame of audio data with the real-time audio signal in the digital mixing module to obtain a mixed audio signal.
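  • as a rough illustration of the mixing step (the patent text does not spell out a mixing algorithm), a digital mix of one frame of decoded audio with one frame of real-time audio can be a clamped, weighted sample-wise sum; the class below is a minimal sketch assuming 16-bit PCM frames of equal length, with illustrative gain values.

```java
// Minimal sketch of frame-wise digital mixing, assuming 16-bit PCM
// frames of equal length. Gains and frame layout are illustrative
// assumptions, not taken from the patent.
public final class FrameMixer {
    /** Mixes one frame of decoded audio with one frame of real-time audio. */
    public static short[] mixFrame(short[] music, short[] realtime,
                                   float musicGain, float realtimeGain) {
        short[] out = new short[music.length];
        for (int i = 0; i < music.length; i++) {
            // Weighted sum of the two signals.
            int sum = (int) (music[i] * musicGain + realtime[i] * realtimeGain);
            // Clamp to the 16-bit range to avoid audio overflow (distortion).
            if (sum > Short.MAX_VALUE) sum = Short.MAX_VALUE;
            if (sum < Short.MIN_VALUE) sum = Short.MIN_VALUE;
            out[i] = (short) sum;
        }
        return out;
    }
}
```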
  • the audio signal processing method mixes, in real time, at least one frame of audio data acquired through the media play interface with a real-time audio signal from the audio input device; the media play interface acquires the audio file in units of frames instead of acquiring the entire audio file at once.
  • what is used to transmit the at least one frame of audio data is a data path inside the system software, so no external environmental noise is introduced.
  • before transmitting the at least one frame of audio data to the digital mixing module through the data path in the system software, the method further comprises: decoding the at least one frame of audio data from compressed form into raw data form.
  • the media play interface can be coupled to a music player.
  • the music player's existing playing process is used to complete the decoding of the at least one frame of audio data, so that when any frame of audio data reaches the digital mixing module it is already in raw data form rather than compressed data; the audio data therefore no longer needs to be decoded in the digital mixing module, improving the efficiency of real-time mixing.
  • the system software includes operating system software.
  • the system software may further include driver software or platform software other than application software, such as open-source system software, middleware, or widgets.
  • the data path includes at least one of: a track source node, an audio controller, an audio hardware abstraction layer, or a hardware driver layer.
  • the data path is the transmission path of the audio data inside the operating system, and the track source node is the starting point from which multi-channel audio data flows to a plurality of different audio tracks;
  • the audio controller is the working engine for audio data processing and audio stream management in the system software;
  • the audio hardware abstraction layer is a software interface abstracted from an audio hardware device;
  • the hardware driver layer is a driver for an audio hardware device.
  • playing of the at least one frame of audio data is prohibited during its transmission to the digital mixing module through the data path in the system software.
  • the at least one frame of audio data thus follows a playback path different from the normal playback process.
  • the at least one frame of audio data is not output from the media play interface for playback, but is sent directly to the digital mixing module.
  • the existing playback process completes decoding of at least one frame of audio data and avoids introducing unnecessary background noise.
  • disabling the playing of the at least one frame of audio data comprises at least one of: turning off the audio output data stream of the at least one frame of audio data through an audio controller in the data path; or controlling, based on the audio hardware abstraction layer in the data path, the audio output device in the data path so as to disable the audio output device for the at least one frame of audio data.
  • before acquiring the real-time audio signal, the method further includes: detecting whether a real-time audio signal is input; and reducing the volume of the at least one frame of audio data when a real-time audio signal input is detected.
  • before acquiring the real-time audio signal, the method further comprises: acquiring a real-time analog audio signal and converting the real-time analog audio signal into the real-time audio signal.
  • the above design adaptively adjusts the volume of the at least one frame of audio data participating in the mix based on the presence or absence of the real-time audio signal, which can highlight the real-time audio signal.
  • reducing the volume of the at least one frame of audio data comprises at least one of: reducing the volume of the at least one frame of audio data by controlling the audio controller in the data path; or reducing the volume of the at least one frame of audio data through the digital mixing module.
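  • the adaptive volume behavior described above is essentially "ducking": while real-time input is present, the background track's gain is lowered. A hypothetical energy-threshold detector might look like the sketch below; the threshold and duck gain are assumptions, not values from the patent.

```java
// Hypothetical ducking logic: lower the background-music gain while the
// microphone frame carries speech-level energy. Threshold is an assumption.
public final class Ducker {
    private static final double ENERGY_THRESHOLD = 1e5; // tune per device

    /** Returns the gain to apply to the background audio for this frame. */
    public static float backgroundGain(short[] micFrame) {
        double energy = 0;
        for (short s : micFrame) energy += (double) s * s;
        energy /= micFrame.length; // mean square amplitude of the mic frame
        // Duck the music to 30% while real-time input is detected.
        return energy > ENERGY_THRESHOLD ? 0.3f : 1.0f;
    }
}
```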
  • before the at least one frame of audio data is mixed with the real-time audio signal in the digital mixing module, the method further comprises performing at least one of the following processing on the real-time audio signal: eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control.
  • in this way, the sound quality of the acquired real-time audio signal can be improved, the noise contained in the real-time audio signal can be reduced, unnecessary interference is prevented from being introduced in the mixing process, and audio overflow during mixing is avoided so that the mixed audio is not distorted.
  • the method further comprises: acquiring a video image signal; mixing the video image signal and the mixed audio signal to obtain a mixed video signal.
  • the method further includes at least one of: playing the mixed audio signal, storing the mixed audio signal locally, transmitting the mixed audio signal to other devices, or uploading the mixed audio signal to the network.
  • after the mixed audio signal is obtained, it can be played in real time, stored for subsequent playback, or shared with other end users or Internet users in real time.
  • the method further includes activating the digital mixing module via a digital mixing interface, the digital mixing interface being an application programming interface.
  • the second aspect of the present application provides an apparatus for audio signal processing, including: a media play interface, a data path in the system software, a real-time signal acquisition module, and a digital mixing module; the media play interface is configured to acquire at least one frame of audio data, the media play interface being an application programming interface; the data path is configured to transmit the at least one frame of audio data to the digital mixing module; the real-time signal acquisition module is configured to acquire a real-time audio signal; and the digital mixing module is configured to mix the at least one frame of audio data with the real-time audio signal to obtain a mixed audio signal.
  • the apparatus further includes a decoding module, configured to decode the at least one frame of audio data from compressed form into raw data form before the data path transmits the at least one frame of audio data to the digital mixing module.
  • the system software includes operating system software.
  • the system software may further include driver software or platform software other than application software, such as open-source system software, middleware, or widgets.
  • the data path includes at least one of: a track source node, an audio controller, an audio hardware abstraction layer, or a hardware driver layer.
  • the data path is further configured to disable playback of the at least one frame of audio data during the transmitting of the at least one frame of audio data to the digital mixing module.
  • the audio controller in the data path is used to turn off the audio output data stream of the at least one frame of audio data; or the audio hardware abstraction layer in the data path is used to control the hardware driver layer in the data path to disable the audio output device for the at least one frame of audio data.
  • the apparatus further includes an audio detection module, configured to: detect whether a real-time audio signal is input before the real-time signal acquisition module acquires the real-time audio signal; and control reducing the volume of the at least one frame of audio data when a real-time audio signal input is detected.
  • the audio detection module is configured to perform at least one of: sending a control signal to the audio controller in the data path to reduce the volume of the at least one frame of audio data, for example, sending a volume-reduction control signal to the audio controller; or sending a control signal to the digital mixing module to reduce the volume of the at least one frame of audio data, for example, sending a volume-reduction control signal to the digital mixing module.
  • the apparatus further includes a pre-processing module, configured to: perform at least one of the following processing on the real-time audio signal: canceling signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control.
  • the apparatus further includes a video processing module, configured to: acquire a video image signal; and mix the video image signal with the mixed audio signal to obtain a mixed video signal.
  • the apparatus further includes at least one of the following modules: a playing module for playing the mixed audio signal; a storage module for storing the mixed audio signal; a transmitting module for sending the mixed audio signal to other devices; or an upload module for uploading the mixed audio signal to the network.
  • the apparatus further includes: a digital mixing interface for receiving the enabling information and forwarding the enabling information to activate the digital mixing module, the digital mixing interface being an application programming interface.
  • a third aspect of the present application provides an apparatus for audio signal processing, the apparatus comprising a processor and an audio processor; the processor is configured to read software instructions stored in a memory and execute the software instructions to implement: acquiring at least one frame of audio data through a media play interface, where the media play interface is an application programming interface; and transmitting the at least one frame of audio data to the audio processor through a data path in the system software; the audio processor is configured to: acquire a real-time audio signal; and mix the at least one frame of audio data with the real-time audio signal to obtain a mixed audio signal.
  • the apparatus further includes the memory.
  • the apparatus further includes a decoder for decoding the at least one frame of audio data from compressed form into raw data form before the data path transmits the at least one frame of audio data to the audio processor.
  • the system software includes operating system software.
  • the system software may further include driver software or platform software other than application software, such as open-source system software, middleware, or widgets.
  • the data path includes at least one of: a track source node, an audio controller, an audio hardware abstraction layer, or a hardware driver layer.
  • the processor is configured to execute the software instructions to further implement: disabling the playing of the at least one frame of audio data during the transmission of the at least one frame of audio data to the digital mixing module.
  • the processor is configured to execute the software instructions to further implement: turning off the audio output data stream of the at least one frame of audio data through the audio controller; or controlling, through the audio hardware abstraction layer, the hardware driver layer in the data path to disable the audio output device for the at least one frame of audio data.
  • the audio processor is further configured to: detect whether a real-time audio signal is input before acquiring the real-time audio signal; and control reducing the volume of the at least one frame of audio data when a real-time audio signal input is detected.
  • the audio processor is specifically configured to: send a control signal to the audio controller in the data path to reduce the volume of the at least one frame of audio data, for example, send a volume-reduction control signal to the audio controller; or send a control signal to the digital mixing module in the audio processor to reduce the volume of the at least one frame of audio data.
  • the audio processor is further configured to: perform at least one of the following processing on the real-time audio signal: canceling signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control.
  • the processor is configured to execute the software instructions to further perform the operations of: acquiring a video image signal; mixing the video image signal with the mixed audio signal to obtain a mixed video signal.
  • the processor is configured to execute the software instructions to further implement: playing the mixed audio signal, storing the mixed audio signal in the memory, sending the mixed audio signal to other devices through the transmission interface, or uploading the mixed audio signal to the network through the network interface.
  • the network interface may be a wireless transceiver, a radio frequency (RF) circuit or a wired interface.
  • the transmission interface can be an input/output interface.
  • a fourth aspect of the present application provides a computer-readable storage medium having stored therein instructions that, when executed on a computer or processor, cause the computer or processor to perform the method described in the first aspect above or any of its possible designs.
  • a fifth aspect of the present application provides a computer program product comprising instructions that, when run on a computer or processor, cause the computer or processor to perform the method described in the first aspect above or any of its possible designs.
  • a sixth aspect of the present application provides an apparatus comprising a processor, the processor being configured to read software instructions in a memory and execute the software instructions to perform the method described in the first aspect above or any of its possible designs.
  • the apparatus further includes the memory for storing the software instructions.
  • FIG. 1 is a schematic structural diagram of a device according to an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a peripheral device according to an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an application scenario of a sound mixing technology according to an embodiment of the present disclosure
  • FIG. 4 is a structural block diagram of a specific architecture of a system software and corresponding hardware components according to an embodiment of the present application
  • FIG. 5 is a flowchart of a method for real-time digital mixing according to an embodiment of the present application
  • FIG. 6 is a schematic structural diagram of another apparatus according to an embodiment of the present disclosure.
  • FIG. 1 is a schematic diagram of an apparatus 100 according to an embodiment of the present disclosure.
  • the apparatus 100 may include an antenna system 110.
  • the antenna system 110 may be a single antenna or may be composed of multiple antennas.
  • the device 100 can also include a radio frequency (RF) circuit 120, which can include one or more analog radio frequency transceivers and can also include one or more digital radio frequency transceivers; the RF circuit 120 is coupled to the antenna system 110.
  • coupling refers to interconnections in a specific manner, including being directly connected or indirectly connected through other devices, for example, through various types of interfaces, transmission lines, buses, and the like.
  • the radio frequency circuit 120 can be used for various types of cellular wireless communications.
  • the apparatus 100 can also include a processing system 130, which can include a communications processor that can be used to control the RF circuit 120 to receive and transmit signals through the antenna system 110; the signals can be voice signals, media signals, or control signals.
  • the communication processor in the processing system 130 can also be used to manage the above signals.
  • the signal management herein can include signal enhancement, signal filtering, coding and decoding, signal modulation, signal mixing, signal separation, various other known forms of signal processing, and new signal processing that may emerge in the future.
  • the processing system 130 can include various general-purpose processing devices, such as a central processing unit (CPU), a system on chip (SOC), a processor integrated on the SOC, or a separate processor chip.
  • the processing system 130 may further include a dedicated processing device, such as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a digital signal processor (DSP).
  • the processing system 130 can be a processor group of multiple processors coupled to one another via one or more buses.
  • the processing system may include an analog-to-digital converter (ADC) and a digital-to-analog converter (DAC) to implement signal connection between different components of the device; for example, the analog voice signal collected by the microphone is converted into a digital voice signal by the ADC and transmitted to the digital signal processor for processing, or the digital voice signal in the processor is converted into an analog voice signal through the DAC and played through the speaker.
  • the processing system can include a media processing system 131 for implementing processing of media signals such as images, audio, and video.
  • the media processing system 131 can include a sound processing system 132.
  • the sound processing system 132 can be a general-purpose or dedicated sound processing device, for example, an audio processing subsystem integrated on the SOC, or a sound processing module integrated on the processor chip.
  • the sound processing module can be a software module or a hardware module, or can be an independently existing sound processing chip; the sound processing system 132 is configured to implement related processing of the audio signal.
  • the apparatus 100 can also include a memory 140 coupled to the processing system 130, and in particular, the memory 140 can be coupled to the processing system 130 by one or more memory controllers.
  • the memory 140 can be used to store computer program instructions, including a computer operating system (OS) and various user applications, such as an audio processing program with a mixing function, an application with a live broadcast function, a video player, a music player, and other possible applications; the memory 140 can also be used to store user data such as calendar information, contact information, acquired image information, audio information, or other media files.
  • Processing system 130 may read computer program instructions or user data from memory 140 or store computer program instructions or user data into memory 140 to implement associated processing functions.
  • the audio file stored in the memory 140 can be read by the processing system and invoked by the music player, or the audio file in the memory 140 can be read into the processor for decoding, mixing, encoding, and the like.
  • the memory 140 may be a non-volatile memory that retains data when powered down, such as an embedded multimedia card (EMMC), universal flash storage (UFS), read-only memory (ROM), or another static storage device that can store static information and instructions; it may also be a volatile memory, such as random access memory (RAM) or another dynamic storage device that can store information and instructions; it may also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other computer-readable storage medium that can be used to carry or store program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto.
  • the memory 140 can be stand-alone, and the memory 140 can also be integrated with the processing system.
  • the device 100 may further include a wireless transceiver 150, which may provide wireless connection capabilities to other devices, such as a wireless headset, a Bluetooth headset, a wireless mouse, a wireless keyboard, etc., or a wireless network, such as Wireless Fidelity (WiFi) network, Wireless Personal Area Network (WPAN) or other Wireless Local Area Network (WLAN).
  • the wireless transceiver 150 can be a Bluetooth-compatible transceiver for wirelessly coupling the processing system 130 to peripheral devices such as a Bluetooth headset or a wireless mouse; the wireless transceiver 150 can also be a WiFi-compatible transceiver for wirelessly coupling the processing system 130 to a wireless network or other devices.
  • the device 100 may also include an audio circuit 160 that is coupled to the processing system 130.
  • the audio circuit 160 may include a microphone 161 and a speaker 162.
  • the microphone 161 may receive a sound input from the outside, and the sound input may be a user voice input, an external music input, a noise input, or other forms of external sound.
  • the microphone 161 may be a built-in microphone integrated on the device 100, or may be an external microphone coupled to the device 100 through an interface, for example an earphone microphone coupled to the device through a headphone jack; the speaker 162 can play audio data, which can come from the microphone, from a music file stored in the memory, or from an audio file processed by the processing system; the speaker is a kind of audio transducer that can amplify the audio signal, and it can also be replaced with other forms of audio transducer. It should be understood that the device 100 may have one or more microphones and one or more earpieces; the numbers of microphones and earpieces are not limited in the embodiments of the present application.
  • the processing system 130 drives or controls the audio circuit through an audio controller (not shown in FIG. 1), specifically enabling or disabling at least one of the microphone or the speaker according to instructions of the processing system 130.
  • when audio needs to be received from the microphone, the processing system enables the microphone through the relevant control command and receives the audio signal input by the microphone.
  • the audio signal can then be processed in the processing system 130, sent to the memory 140 for storage, played through the speaker, transmitted by the RF circuit 120 to the network or another device via the antenna system 110, or transmitted to the network or another device via the wireless transceiver 150.
  • when an audio file needs to be played, the processing system enables the speaker through the relevant control commands to play the audio signal.
  • when neither is needed, the microphone and speaker are disabled through the relevant control commands.
  • the device 100 can also include a display screen 170 for displaying information entered by the user and various menus provided to the user, the menus being associated with specific internal modules or functions; the display screen 170 can also accept user input, such as control information for enabling or disabling.
  • the display screen 170 may include a display panel 171 and a touch panel 172.
  • the display panel 171 can be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, a light-emitting diode (LED) display device, a cathode ray tube (CRT), or the like.
  • the touch panel 172, also referred to as a touch screen or touch-sensitive screen, can collect contact or non-contact operations by the user on or near it (for example, operations performed by the user on or near the touch panel 172 using a finger, a stylus, or any other suitable object or accessory; operations near the touch panel 172 may also include somatosensory operations); the operations include single-point control operations, multi-point control operations, and the like, and the touch panel drives the corresponding connected device according to a preset program.
  • the touch panel 172 can include two parts: a touch detection device and a touch controller.
  • the touch detection device detects a signal from the user's touch operation and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into information that the processing system 130 can process, and then sends that information to the processing system 130; it can also receive commands from the processing system 130 and execute them.
  • the touch panel 172 can cover the display panel 171, and the user can operate on or near the covered touch panel 172 according to the content displayed on the display panel 171 (the displayed content includes, but is not limited to, a soft keyboard, a virtual mouse, virtual buttons, icons, and the like).
  • after the touch panel 172 detects an operation on or near it, the operation is transmitted to the processing system 130 through the I/O subsystem 10 to determine the user input, and the processing system 130 then provides a corresponding visual output on the display panel 171 via the I/O subsystem 10 based on the user input.
  • although the touch panel 172 and the display panel 171 are shown in FIG. 1 as two separate components implementing the input and output functions of the device 100, in some embodiments the touch panel 172 can be integrated with the display panel 171 to implement the input and output functions of the device 100. The digital mixing operation is described below as an example.
  • the user touches the enable button associated with the digital mixing module on the display screen; the touch detection device detects the enable signal produced by the touch operation and transmits the enable signal to the touch controller; the touch controller converts the enable information into information that the processor can process and transmits it, through the display controller 13 in the I/O subsystem 10, to the processor in the processing system; after receiving the enable information, the processor activates the digital mixing module, and after the digital mixing operation is completed, the processed mixed audio is sent to the player for playback, with related information about the mixed-audio playback displayed through the display controller 13 in the I/O subsystem 10.
  • the related information may include information such as the playing time, real-time lyrics, and the like.
  • the apparatus 100 may also include one or more sensors 180 coupled to the processing system 130, which may include an image sensor, a motion sensor, a proximity sensor, an environmental noise sensor, an acoustic sensor, an accelerometer, a temperature sensor, a gyroscope, other types of sensors, and combinations of their various forms.
  • Processing system 130 drives sensor 180 through sensor controller 12 in I/O subsystem 10 to receive various information such as audio signals, image signals, motion information, etc. Sensor 180 transmits the received information to processing system 130 for processing.
  • the device 100 may also include other input devices 190 coupled to the processing system 130 to receive various user inputs, such as receiving input numbers, names, addresses, and media selections, etc., which may be music or other audio files, various videos Formatted video files, still pictures, dynamic pictures, etc.
  • other input devices 190 may include a keyboard, physical buttons (push buttons, rocker buttons, etc.), dials, slide switches, joysticks, click wheels, a light mouse (a touch-sensitive surface that does not display visual output, or an extension of the touch-sensitive surface formed by a touch screen), and the like.
  • the apparatus 100 may also include the I/O subsystem 10 described above, which may include other input device controllers 11 for receiving signals from other input devices 190 or for transmitting control of the processing system 130 to other input devices 190.
  • the I/O subsystem 10 may also include the sensor controller 12 and display controller 13 described above for effecting the exchange of data and control information between the sensor 180 and the display screen 170 and the processing system 130, respectively.
  • the device 100 may also include a power source 101 to power the other components of the device 100 (including 110-190); the power source may be a rechargeable or non-rechargeable lithium-ion battery or nickel-metal-hydride battery.
  • when the power source 101 is a rechargeable battery, it can be coupled to the processing system 130 through a power management system to implement functions such as managing charging, discharging, and power consumption adjustment through the power management system.
  • the device 100 may further include a camera for acquiring a single-frame image or continuous multi-frame video images according to the working mode of the device 100, and transmitting the acquired image information to the processing system 130 for processing.
  • the processing system 130 may be integrated with an image processing unit or include a separate image processor or image processing chip.
  • the processing system 130 may further include a video codec.
  • the video codec (Video Codec) module may be used to fuse image signals and audio signals into a video signal; when the device works in video shooting mode, the continuous multi-frame video images acquired by the camera and the audio signal obtained by the audio circuit 160 are fused in the Video Codec to obtain a video signal with sound.
  • the Video Codec may be a processing module integrated in the processing system 130, or may be a separate video codec chip.
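  • the patent does not name a specific fusion mechanism; purely as an illustration, on Android the encoded video and mixed-audio elementary streams could be combined into one file with android.media.MediaMuxer, as in the sketch below (output path, formats, and sample buffers are assumed inputs).

```java
import android.media.MediaCodec;
import android.media.MediaFormat;
import android.media.MediaMuxer;
import java.io.IOException;
import java.nio.ByteBuffer;

// Illustrative sketch: combine an encoded video track with the encoded
// mixed-audio track into an MP4 container. In a real pipeline each encoded
// sample is written as the encoders emit it.
final class AvFuser {
    static void fuse(String outPath, MediaFormat videoFmt, MediaFormat audioFmt,
                     ByteBuffer videoSample, MediaCodec.BufferInfo videoInfo,
                     ByteBuffer audioSample, MediaCodec.BufferInfo audioInfo)
            throws IOException {
        MediaMuxer muxer = new MediaMuxer(outPath,
                MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
        int vTrack = muxer.addTrack(videoFmt);
        int aTrack = muxer.addTrack(audioFmt);
        muxer.start();
        muxer.writeSampleData(vTrack, videoSample, videoInfo);
        muxer.writeSampleData(aTrack, audioSample, audioInfo);
        muxer.stop();
        muxer.release();
    }
}
```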
  • the apparatus 100 in FIG. 1 is merely an example, and the specific form of the apparatus 100 is not limited, and the apparatus 100 may further include other components that are not shown in FIG. 1 or may be added in the future.
  • the RF circuit 120, the processing system 130, and the memory 140 may be partially or fully integrated on one chip, or may be three independent chips.
  • the RF circuit 120, the processing system 130, and the memory 140 may include one or more integrated circuits disposed on a Printed Circuit Board (PCB).
  • the peripheral device 200 performs data exchange and signal interaction with the device 100.
  • the peripheral device may include a processor 210, a microphone 220, an audio transducer 230, a wireless transceiver 240, and one or more sensors 250.
  • the processor 210 is coupled to the microphone 220 and the audio transducer 230; the processor 210 can drive or disable the microphone 220 and the audio transducer 230, and can accept audio data from the microphone 220 or transmit audio data to the audio transducer 230.
  • processor 210 can be coupled to an audio codec to effect signal exchange with microphone 220 and audio transducer 230.
  • audio transducer 230 can be a speaker.
  • the wireless transceiver 240 is also coupled to the processor 210.
  • the peripheral device 200 establishes a wireless connection with the device 100, or with another device having similar functions, through the wireless transceiver 240; for example, the peripheral device 200 can transmit signals to or receive signals from the device 100 via the wireless transceiver 240 under the control of the processor 210.
  • the processor 210 is coupled to one or more sensors 250 that can be used to detect user activity, various environmental information, and the like; the sensors 250 may be identical or partially identical to the sensors 180 in the device 100, and the sensors 250 pass the detected data to the processor 210 for processing.
  • the peripheral device 200 shown in FIG. 2 may be a wireless headset that can exchange data with the device 100 via a Bluetooth interface. It should be understood that although FIG. 2 illustrates a particular form of peripheral device, this does not limit the form of the peripheral device; in some alternatives, the peripheral device 200 can also be a wired headset, a wired keyboard, a wireless keyboard, a wired or wireless cursor control device, or another wired or wireless input or output device.
  • the system of device 100 and peripheral device 200 can be a variety of types of data processing systems, for example, the system can be an audio data processing system, an image data processing system, a temperature data processing system, or other data processing system.
  • FIG. 3 is a schematic diagram of an application scenario of a mixing technology according to an embodiment of the present disclosure, where 310 is a terminal including a mixing module; the terminal 310 may be a specific form of the device 100 shown in FIG. 1 and includes all or part of the structure of the device 100; 311 is a microphone, which may be a microphone provided by the terminal 310 itself, or may be a separate microphone implemented as a peripheral device 200 that exchanges data with the terminal 310 by wired or wireless communication; for example, the microphone may be the microphone of a wireless earphone, a microphone on a wired earphone, a sound-sensitive device, or an external sound card with a sound collection function.
  • in this scenario, the user can record on site and combine the voice content the user shares with specific background music, thereby making the shared content more interesting.
  • the voice content shared by the user may be singing, a talk show, a voice lecture, or other forms of voice sharing.
  • the terminal 310 mixes the user voice collected by the microphone with the preset background music to obtain the mixed audio.
  • the user can locally play the obtained mixed audio through the speaker 312; optionally, the user can also upload the obtained mixed audio data to the network 320 based on wireless communication technology and share the mixed audio data with many Internet users through the network.
  • Internet users can obtain the mixed audio from the network 320 through various devices having wireless communication functions; as shown in FIG. 3, listener 1 acquires the mixed audio data transmitted on the network downlink through a smartphone 330; listener 2 acquires the mixed audio data transmitted on the network downlink through a desktop computer 340 connected to the network; and listener 3 acquires the mixed audio data transmitted on the network downlink through a notebook computer 350.
  • the device used by the Internet user may also be an audio player with wireless communication capabilities, an in-vehicle device, a wearable device, a tablet, a Bluetooth headset, a radio, and the like.
  • the obtained mixed audio data may also be stored in a storage device of the local terminal, for example the memory 140 of the device 100, so that the user can play back or share the stored mixed audio data when needed.
  • the user can also implement a live web video broadcast.
  • the live video content can be a dance performance, a craft display and explanation, an online etiquette lecture, or other content that can be presented through video.
  • the user can mix in different background music according to the style of the video content to enhance the fun and entertainment value of the video content.
  • the terminal 310 acquires the user's image information through the camera (not shown in FIG. 3) and acquires the user's voice information through the microphone 311; the user voice collected by the microphone is mixed with the background music specified by the user in the mixing module of the terminal 310 to obtain the mixed audio, and the image information acquired by the camera is then fused with the obtained mixed audio information in the video codec (Video Codec) module (not shown in FIG. 3) to obtain the mixed video.
  • the user transmits the obtained mixed video data to the network 320 based on the wireless communication technology, and the Internet user can acquire the mixed video from the network 320 and view it through various devices having a wireless communication function and having a video playing function.
  • the user can view the effect of the mixed video through the local terminal 310; optionally, the user can save the recorded mixed video in the storage device of the terminal 310 for subsequent playback and sharing.
  • the wireless communication technology used in the above application scenarios may be any of various technologies that can provide voice call, video, data, broadcast, or other cellular wireless communication services, such as Global System for Mobile Communications (GSM), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), future fifth-generation (5G) mobile communication technology, or a publicly available Public Land Mobile Network (PLMN) technology.
  • the method for obtaining the background music specified by the user is not limited in the above scenario.
  • the background music specified by the user may be played externally into the environment and enter the mixing module of the terminal 310 through the microphone in the same way as the user's voice.
  • the user-specified background music can be read directly from the terminal's local memory without being externally placed.
  • the cached background music can also be obtained from the network.
  • FIG. 4 is a structural block diagram of a specific architecture of a system software and corresponding hardware components according to an embodiment of the present application.
  • the specific architecture of the system software and corresponding hardware components can implement a real-time digital mixing method.
  • the device 410 is an application processor (AP) in the processing system 130, or an application processor chip; the application processor is used to run system software, for example operating system software, where the operating system software may be at least one of the Android system, the iOS system, or a Linux system; the system software may further include driver software or platform software other than application software (also called applications), such as open-source system software, middleware, or widgets.
  • the application processor can also be used to run application-related code, where the application can be a webcast platform, an audio player, a video player, a beauty camera, or a communication-capable application; optionally, the application processor supports application extensions; optionally, the application processor can be used to run user-interface-related code.
  • the memory 420 can be the memory 140; it can be used to store the operating system software running in the application processor, application software, user-interface-related software, or other computer program code that can be executed by the application processor, and it can also store local audio files, local video files, user phonebooks, or other user data.
  • the touch screen 430 can be the display screen 170 and can be used to display various menus and function icons of the device that are associated with specific modules or functions within the device; for example, a function icon associated with the audio player APP is displayed on the touch screen.
  • the sound processing module 440 can be used to implement audio-related processing; for example, it can be an independent hardware processor used to implement real-time mixing of multiple audio signals.
  • the sound processing module 440 can be a high-fidelity (HiFi) device.
  • the sound processing module 440 may be a separate sound processing chip, a processing module integrated in the application processor, or a processing module integrated in a processor chip outside the application processor; the sound processing module 440 can also be implemented in software run by the application processor, which is not limited in this embodiment.
  • when the sound processing module 440 is implemented in software, it is similar to the application layer or application framework layer formed by the application processor: a processing unit formed by the application processor executing the instructions of the driver software.
  • the sound processing module 440 includes a digital mixing module 441 for implementing a real-time digital mixing algorithm.
  • the sound processing module 440 further includes a pre-processing module 442 for performing acoustic processing on the audio signal.
  • the processing that the pre-processing module 442 can perform includes eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, gain control, or other acoustic processing algorithms.
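  • of the listed pre-processing steps, gain control is the simplest to sketch; the hypothetical per-frame peak limiter below scales a frame down when its peak would exceed a target ceiling (the ceiling value is an assumption, not from the patent).

```java
// Hypothetical per-frame gain control for 16-bit PCM: scale the frame
// down when its peak exceeds a target ceiling. Ceiling is an assumption.
public final class GainControl {
    private static final int CEILING = 28000; // leaves headroom below 32767

    public static void limit(short[] frame) {
        int peak = 0;
        for (short s : frame) peak = Math.max(peak, Math.abs((int) s));
        if (peak > CEILING) {
            float scale = (float) CEILING / peak;
            for (int i = 0; i < frame.length; i++) {
                frame[i] = (short) (frame[i] * scale);
            }
        }
    }
}
```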
  • the audio codec 450 is configured to implement at least one of analog-to-digital conversion or digital-to-analog conversion of the audio signal.
  • for example, the audio codec 450 may convert the audio data processed by the sound processing module into an analog signal and play it through the speaker 470, a wired earphone, or a Bluetooth earphone (not shown in FIG. 4); conversely, the audio codec 450 can convert the analog audio signal collected by the microphone 460 or another audio input device into a digital audio signal and transmit it to the sound processing module or other processors for various audio processing; the audio codec 450 may be a separate audio codec chip, or may be an audio codec module integrated in a processor chip or a HiFi chip.
  • the application processor 410, the sound processing module 440, and the audio codec 450 may collectively constitute the sound processing system 132 of FIG. 1 for implementing various forms of audio signal processing.
  • the application processor 410 can be used to run operating system software, and a specific application framework of the audio and video architecture of the operating system software is shown in FIG. 4; for example, the application framework is an Android application framework. As shown in FIG. 4, the application framework includes the following layers:
  • the APP layer is located at the top of the entire audio and video software architecture.
  • the layer is implemented based on a Java structure.
  • the APP layer includes an audio related application programming interface (API), a video related API, or other kinds of APIs.
  • an API is bound to a specific internal application (APP); control parameters are sent through the API to call the corresponding APP, or the APP's return value is received through the API.
  • the Framework layer of the application framework is the logical scheduling layer of the entire audio and video software architecture; it is the policy control center of the entire audio and video software architecture, which can schedule and apply policy to the entire audio and video processing process.
  • the layer also includes some API interfaces for implementing audio and video data stream processing and control of audio and video hardware devices.
  • the core architecture of the layer is composed of at least one of Java, C++ or C.
  • the hardware abstraction layer (HAL) is the interface layer between the operating system software of the audio and video architecture and the audio and video hardware devices; it provides an interface for interaction between the upper-layer software and the underlying hardware.
  • the HAL layer abstracts the underlying hardware into software containing the corresponding hardware interface.
  • the underlying hardware devices can be configured through the HAL layer; for example, the related hardware devices can be enabled or disabled at the HAL layer.
  • the core architecture of the HAL layer consists of at least one of C++ or C.
  • the kernel (Kernel) layer includes the hardware driver layer and is used to directly control the underlying hardware devices according to control information input by the hardware abstraction layer, for example driving or disabling a hardware device; the core architecture of the Kernel layer is at least one of C++ or C.
  • the application processor 410 and the sound processing module 440 realize the interaction between the data and the control information through the relay communication layer.
  • the relay communication layer may be a Mailbox communication mechanism that implements the interaction between the system software or application software of the application processor 410 and the sound processing module 440.
  • when the sound processing module 440 is formed by software instructions run by the application processor 410, the MailBox is an interface between the sound processing module 440 and the system software.
  • when the sound processing module 440 is independent hardware or software outside the application processor 410, the MailBox may be an interface including hardware.
  • optionally, the sound processing module 440 is a separate piece of hardware, such as a coprocessor, microprocessor, or logic circuit, performing functions separate from those of the application processor.
  • FIG. 5 is a schematic diagram of a real-time digital mixing method provided by an embodiment of the present application
  • FIG. 6 is a logic block diagram of an apparatus for implementing the real-time digital mixing method.
  • the method of real-time digital mixing in FIG. 5 will be described below based on the operating system software architecture and corresponding hardware components shown in FIG. 4 and the apparatus shown in FIG.
  • the embodiment of the present application describes the method of real-time digital mixing in the form of steps; although an order of the steps is shown in the method flowchart of FIG. 5, in some cases the steps may be performed in a different order than described here. It should be understood that the structural block diagram in FIG. 4 and the apparatus in FIG. 6 do not limit each other.
  • the method for real-time digital mixing includes: Step 501: Acquire at least one frame of audio data through a media play interface.
  • the at least one frame of audio data may be audio data that already exists, including background sounds, music, or accompaniment music, and the like.
  • the media play interface corresponds to the media play interface 610 of the device 600; the media play interface is specifically an application programming interface (API), located in the application layer of the application processor 410 in FIG. 4; optionally, the media play interface may specifically be an audio player API, a video player API, a webcast platform API, or another application API with audio and video playback capability.
  • for example, an audio player function icon located on the touch screen 430 is touched, the function icon being associated with an audio play interface API; at least one frame of audio data is then read by calling the audio play APP through the audio play interface.
  • the at least one frame of audio data here may be at least one frame of audio data in a local audio file stored in a memory (for example, the memory 140 in the device 100, or the memory 420 in the block diagram of FIG. 4); optionally, it may also be at least one frame of audio data in an audio file buffered or downloaded from the Internet.
  • the audio file is called from the media player in units of frames, so the audio file can be processed flexibly through the playing process.
  • the audio data of any frame of other audio files can be flexibly switched at any time through the media playing interface.
  • the playback process here specifically refers to the entire process of calling the audio file from the media player and finally playing the audio file through a speaker or another audio output device.
  • the method for real-time digital mixing comprises the step 502 of decoding at least one frame of audio data from a compressed state into a raw data form.
  • decoding the at least one frame of audio data of the invoked audio file from the compressed state into the raw data form is completed in the playback process.
  • this step may be performed by the decoding module 650 of the device 600; optionally, the decoding module 650 can be the audio codec 450 of FIG. 4; optionally, the decoding module 650 can be an audio codec provided by the media player itself, which can be implemented as a software module or in hardware; optionally, the decoding module 650 may be a separate audio codec chip, or may be an audio codec module integrated in a processor chip or a HiFi chip.
  • the audio data of the original data form may be a Pulse Code Modulation (PCM) data stream.
  • the compressed audio data may be Windows Media Audio (WMA), Adaptive Predictive Encoding (APE), Free Lossless Audio Codec (FLAC), or Moving Picture Experts Group Audio Layer III (MP3) data, among other formats.
  • the audio data in the raw data form is the decoding result obtained by decoding the compressed audio data with the corresponding decoding technique.
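  • as a hedged illustration of step 502 on Android (the patent does not prescribe a decoder implementation), MediaExtractor plus MediaCodec can decode compressed audio to PCM frame by frame:

```java
import android.media.MediaCodec;
import android.media.MediaExtractor;
import android.media.MediaFormat;
import java.io.IOException;
import java.nio.ByteBuffer;

// Sketch: decode compressed audio (e.g. MP3 or FLAC) to raw PCM, frame by
// frame. Assumes track 0 of the file is the audio track.
final class FrameDecoder {
    static void decode(String path) throws IOException {
        MediaExtractor extractor = new MediaExtractor();
        extractor.setDataSource(path);
        MediaFormat fmt = extractor.getTrackFormat(0);
        extractor.selectTrack(0);
        MediaCodec codec = MediaCodec.createDecoderByType(
                fmt.getString(MediaFormat.KEY_MIME));
        codec.configure(fmt, null, null, 0);
        codec.start();
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        boolean done = false;
        while (!done) {
            int in = codec.dequeueInputBuffer(10_000);
            if (in >= 0) {
                ByteBuffer buf = codec.getInputBuffer(in);
                int size = extractor.readSampleData(buf, 0);
                if (size < 0) { // end of stream: flush the decoder
                    codec.queueInputBuffer(in, 0, 0, 0,
                            MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                } else {
                    codec.queueInputBuffer(in, 0, size,
                            extractor.getSampleTime(), 0);
                    extractor.advance();
                }
            }
            int out = codec.dequeueOutputBuffer(info, 10_000);
            if (out >= 0) {
                ByteBuffer pcm = codec.getOutputBuffer(out);
                // ...hand this raw PCM frame to the digital mixing module...
                codec.releaseOutputBuffer(out, false);
                done = (info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0;
            }
        }
        codec.stop();
        codec.release();
        extractor.release();
    }
}
```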
  • the method of real-time digital mixing includes the step 503 of transmitting at least one frame of audio data to a digital mixing module through a data path in the system software.
  • the data path can be the data path 620 of device 600.
  • the system software herein may be operating system software, such as an Android operating system, a Linux operating system, an iOS system, or other types of operating system software.
  • the data path is a transmission path of the audio data in the operating system.
  • the data path here runs through the entire architecture of the operating system software.
  • The data path may include at least one of the following: a track source node 621, an audio controller 622, an audio hardware abstraction layer 623, or a hardware driver layer 624.
  • After the at least one frame of audio data of the audio file is obtained, it is transmitted from the media play interface MediaPlayer at the application layer to the track source node at the application framework layer. As shown in FIG. 4, the track source node is located in the Framework layer of the operating system software; specifically, the track source node is the AudioTrack interface in the Audio system. The AudioTrack interface is an API that the Audio system provides externally; AudioTrack is the source node, or starting point, of multiple audio tracks, and audio data with different parameter characteristics converges at this interface. AudioTrack selects different audio tracks for audio data based on its parameter characteristics; an audio track is an audio standard with fixed parameter characteristics. AudioTrack enables the output of audio data on the operating system platform. Optionally, the parameter characteristics of audio data may include the sampling rate, bit depth, number of channels, type of audio stream, and the like.
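  • As an illustration only: the parameter characteristics listed above (sampling rate, bit depth, channel count, stream type) are exactly what an application supplies when it opens a track toward AudioTrack. A minimal sketch using the classic AudioTrack constructor; the concrete values are assumptions, not taken from the original text.

```java
import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;

// Sketch: open a PCM track with explicit parameter characteristics.
public final class TrackOpener {
    public static AudioTrack open() {
        int sampleRate = 44_100;                          // sampling rate
        int channels   = AudioFormat.CHANNEL_OUT_STEREO;  // number of channels
        int depth      = AudioFormat.ENCODING_PCM_16BIT;  // bit depth
        int minBuf     = AudioTrack.getMinBufferSize(sampleRate, channels, depth);

        AudioTrack track = new AudioTrack(
                AudioManager.STREAM_MUSIC,                // type of audio stream
                sampleRate, channels, depth,
                minBuf, AudioTrack.MODE_STREAM);
        track.play();
        return track;  // PCM frames are then fed with track.write(...)
    }
}
```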
  • The at least one frame of audio data flowing in from the track source node 621 reaches the audio controller 622, which, as shown in FIG. 4, is located at the Framework layer of the operating system software. Specifically, the audio controller is AudioFlinger, the working engine of the Audio system. AudioFlinger manages all input and output audio streams of the audio system and controls reads and writes of the underlying hardware devices. For example, AudioFlinger can adjust the volume of audio data, or the audio data output stream can be turned off through AudioFlinger so that the audio data is prohibited from reaching the underlying hardware device. Optionally, the underlying hardware device may be the speaker 162 of the device 100, the speaker 470 in FIG. 4, or another audio output device. In an optional solution, when audio data following the playback process should not be played to the outside through an audio output device, the audio output data stream can be turned off through AudioFlinger to prohibit the audio data from reaching the underlying hardware device.
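  • As an illustration only: AudioFlinger itself is internal to the Framework, but the volume control the text attributes to it is reachable from the application side through the track object. A hedged sketch — muting keeps the PCM frames flowing along the data path while nothing is rendered by the output device.

```java
import android.media.AudioTrack;

// Sketch: application-level handle to the control described above.
public final class TrackMuter {
    public static void mute(AudioTrack track)   { track.setVolume(0.0f); } // silence output
    public static void unmute(AudioTrack track) { track.setVolume(1.0f); } // restore output
}
```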
  • After passing through the audio controller 622, the at least one frame of audio data reaches the audio hardware abstraction layer 623, which, as shown in FIG. 4, is located at the HAL layer. Specifically, the audio hardware abstraction layer is the Audio HAL, a software abstraction of the underlying audio hardware devices: each underlying audio hardware device has a corresponding software interface at the Audio HAL layer, through which the corresponding device can be controlled, for example enabled or disabled.
  • After passing through the audio hardware abstraction layer 623, the at least one frame of audio data reaches the hardware driver layer, Driver 624; the Driver is the direct executor of control actions. Control commands for the underlying hardware devices at the Audio HAL layer are all implemented by the Driver. For example, when "drive the speaker" is set at the Audio HAL layer, the Driver drives the speaker under that control command; in an optional solution, when audio data following the playback process should not be played externally, "disable the speaker" can be set at the Audio HAL layer, and the Driver disables the speaker under that control command.
  • To reach the digital mixing module from the application layer of the operating system software architecture, the at least one frame of audio data also passes through a relay communication layer; as shown in FIG. 4, the relay communication layer may be a mailbox communication mechanism that implements the exchange of data or control information between the system software or application software of the application processor and the sound processing module.
  • Optionally, the real-time digital mixing method may include steps 504-506. Step 504: detect whether a real-time audio signal is being input; when input of a real-time audio signal is detected, perform step 505; when no real-time audio signal input is detected, perform step 506. When two audio signals are mixed, it is usually desirable to emphasize one of them; in the method provided here, emphasizing the real-time audio signal while attenuating the at least one frame of audio data acquired through the media play interface yields a better mixing experience.
  • This step may be performed by the audio detection module 660 of the device 600, which may be a software module or hardware circuit integrated in the processor or the sound processing module, or an independent chip. In an optional solution, an audio detection module may be added to the sound processing module 440 in FIG. 4; after the digital mixing module is activated through the digital mixing interface, whether the audio input device has real-time audio signal input can be detected based on the audio detection module. Optionally, the audio detection module may be a voice activity detection (VAD) module.
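  • As an illustration only: a real VAD module typically combines several acoustic cues, but the decision described here can be sketched as a short-term energy detector; the threshold value is an assumption and would be tuned per device.

```java
// Sketch: energy-based stand-in for the VAD module.
public final class SimpleVad {
    private static final double THRESHOLD_RMS = 1000.0; // assumed threshold

    // Returns true when the captured frame looks like live input,
    // i.e. step 505 (duck) should run; false selects step 506.
    public static boolean hasRealTimeInput(short[] pcmFrame) {
        long sumSq = 0;
        for (short s : pcmFrame) sumSq += (long) s * s;
        double rms = Math.sqrt((double) sumSq / pcmFrame.length);
        return rms > THRESHOLD_RMS;
    }
}
```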
  • Step 505: Reduce the volume of the at least one frame of audio data.
  • Step 506: Increase the volume of the at least one frame of audio data.
  • Specifically, when the audio detection module detects that a real-time audio signal is being input, the audio detection module 660 sends a control signal to the audio controller 622 in the data path 620 to reduce or increase the volume of the at least one frame of audio data. Optionally, the audio controller 622 is AudioFlinger, which reduces or increases the volume of the at least one frame of audio data after receiving the control signal.
  • In an optional solution, the audio detection module 660 sends control information to the digital mixing module 640 to reduce or increase the volume of the at least one frame of audio data. Optionally, this can be done by changing a volume-related variable in the digital mixing module: the volume of the at least one frame of audio data can be reduced by decreasing the volume-related variable and increased by increasing it. Optionally, the digital mixing module may also be the digital mixing module 441 in FIG. 4.
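  • As an illustration only: the volume-related variable described above can be pictured as a single gain factor applied to the media frames before mixing; the class name Ducker and the step sizes are hypothetical.

```java
// Sketch: adaptive ducking of the media frames around live input.
public final class Ducker {
    private float mediaGain = 1.0f;  // the volume-related variable

    public void update(boolean liveInput) {
        mediaGain = liveInput ? Math.max(0.2f, mediaGain - 0.1f)   // step 505: reduce
                              : Math.min(1.0f, mediaGain + 0.1f);  // step 506: increase
    }

    public void apply(short[] mediaFrame) {
        for (int i = 0; i < mediaFrame.length; i++) {
            mediaFrame[i] = (short) (mediaFrame[i] * mediaGain);
        }
    }
}
```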
  • the method of real-time digital mixing includes the step 507 of acquiring a real-time audio signal.
  • the real-time audio signal may be a digital audio signal obtained after processing sounds from humans or nature.
  • the real-time audio signal may be acquired by the real-time signal acquisition module 630 of the device 600.
  • The real-time signal acquiring module 630 may be an interface for receiving a real-time audio signal sent by another device; the real-time audio signal may be a digital signal that has or has not undergone at least one of the following processes: eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control. It should be understood that "real-time" here means there is no time delay between the sound source producing the sound and the acquisition of that sound, or that any delay is small enough to be negligible.
  • Optionally, before the real-time audio signal is acquired, the method for real-time digital mixing further includes: acquiring a real-time analog audio signal. Optionally, the device 600 may include an audio input device and an analog-to-digital converter (ADC); the real-time analog audio signal may be acquired through the audio input device and further converted by the ADC to obtain the real-time audio signal. Optionally, the audio input device may be a microphone, a sound-sensitive device, or another device with a sound collection function, such as the microphone 161 in FIG. 1, the microphone 220 in FIG. 2, or the microphone 460 in FIG. 4; optionally, the audio input device may be the peripheral device shown in FIG. 2, for example a wireless or wired headset; optionally, the audio input device may be a chip with a sound collection function, or an audio codec (Codec) connected to a microphone, sound-sensitive device, or other device with a sound collection function.
  • the method of real-time digital mixing includes the step 508 of mixing at least one frame of audio data with a real-time audio signal in a digital mixing module to obtain a mixed audio signal.
  • For example, the at least one frame of audio data may be background music, narration, accompaniment, or the like, while the real-time audio signal originates from human or natural sounds; mixing the two achieves the mix.
  • the digital mixing module may be the digital mixing module 640 of the device 600; optionally, the digital mixing module may be the digital mixing module 441 in the sound processing module 440 as shown in FIG. 4;
  • the digital mixing module can be a software module, for example, a function; optionally, the digital mixing module can also be implemented by hardware logic; optionally, the digital mixing module can be independent hardware. For example, it can be a coprocessor, a microprocessor or other processor core.
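  • As an illustration only: whatever form the digital mixing module takes, the core of step 508 on 16-bit PCM can be sketched as a sample-by-sample sum with clamping, which also guards against the audio overflow discussed later in connection with pre-processing.

```java
// Sketch: mix one media frame with one real-time frame of 16-bit PCM.
public final class Mixer {
    public static short[] mix(short[] media, short[] live) {
        int n = Math.min(media.length, live.length);
        short[] out = new short[n];
        for (int i = 0; i < n; i++) {
            int sum = media[i] + live[i];                        // widen to int first
            if (sum > Short.MAX_VALUE) sum = Short.MAX_VALUE;    // clamp, no overflow
            if (sum < Short.MIN_VALUE) sum = Short.MIN_VALUE;
            out[i] = (short) sum;
        }
        return out;
    }
}
```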
  • In an optional solution, when the digital mixing module is a software module, a new API corresponding to the digital mixing module may be added in the application layer of the audio/video software architecture, and interaction with the internal digital mixing module is implemented through this digital mixing interface. For example, the API may be a digital mixing interface through which control information is sent to the digital mixing module (as shown by the downward dashed arrow in FIG. 4) to control it, or through which audio data is received from the digital mixing module (as shown by the upward solid arrow from the digital mixing module to the digital mixing interface in FIG. 4). Specifically, the audio data may be the mixed audio signal obtained by real-time mixing, and the control information may include digital mixing module enable information or disable information.
  • Optionally, the audio source management interface shown in FIG. 4 may be AudioRecord, which manages the audio source and is responsible for collecting audio data on the operating system platform (for example, the Android platform), e.g. recording with the platform's audio input hardware.
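  • As an illustration only: on the Android platform, capturing the real-time signal frame by frame through the audio input hardware could look roughly as follows (the RECORD_AUDIO permission is required; the sample rate and buffer sizes are assumptions).

```java
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

// Sketch: read one frame of live PCM from the microphone.
public final class LiveCapture {
    public static void captureOneFrame(short[] frame) {
        int rate = 44_100;
        int buf = AudioRecord.getMinBufferSize(rate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
        AudioRecord rec = new AudioRecord(MediaRecorder.AudioSource.MIC, rate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, buf);
        rec.startRecording();
        rec.read(frame, 0, frame.length);  // one frame of the real-time signal
        rec.stop();
        rec.release();
    }
}
```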
  • In an optional solution, before the at least one frame of audio data is mixed with the real-time audio signal, the real-time audio signal may be subjected to at least one of the following processes: eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control.
  • Optionally, the above processing may be performed in the pre-processing module 680 of the device 600. The pre-processing module 680 can be implemented as a software module or hardware logic, for example a software module in HiFi or a piece of hardware logic integrated in HiFi; optionally, the pre-processing module may be integrated in one chip with the digital mixing module 640, or may be a stand-alone chip. Optionally, the pre-processing module may be the pre-processing module 442 in FIG. 4 or a pre-processing module in the sound processing system 132 of the device 100. In an optional solution, the at least one process may be implemented in an audio codec; optionally, the audio codec may be the audio codec 450 in FIG. 4, an audio codec module integrated in the audio circuit 160 of the device 100, or an audio codec located in the processing system 130.
  • Further, before the real-time audio signal is acquired, an analog audio signal is acquired, and the processing may further include analog-to-digital conversion (ADC) to convert the analog audio signal into a digital real-time audio signal. Performing the above processing on the real-time audio signal before mixing improves the sound quality of the acquired real-time audio signal, reduces the noise it contains, avoids introducing unnecessary interference into the mixing process, and also prevents audio overflow during mixing, avoiding distortion of the mixed audio.
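  • As an illustration only: on Android, several of the listed pre-processing stages map loosely onto the platform's capture effects. A hedged sketch of requesting them on a capture session follows; whether each effect is actually implemented depends on the device and its HAL.

```java
import android.media.audiofx.AcousticEchoCanceler;
import android.media.audiofx.AutomaticGainControl;
import android.media.audiofx.NoiseSuppressor;

// Sketch: attach echo cancellation, noise suppression, and gain control
// to the capture session identified by audioSessionId (e.g. from AudioRecord).
public final class CapturePreprocessing {
    public static void attach(int audioSessionId) {
        if (AcousticEchoCanceler.isAvailable()) {
            AcousticEchoCanceler aec = AcousticEchoCanceler.create(audioSessionId);
            if (aec != null) aec.setEnabled(true);
        }
        if (NoiseSuppressor.isAvailable()) {
            NoiseSuppressor ns = NoiseSuppressor.create(audioSessionId);
            if (ns != null) ns.setEnabled(true);
        }
        if (AutomaticGainControl.isAvailable()) {
            AutomaticGainControl agc = AutomaticGainControl.create(audioSessionId);
            if (agc != null) agc.setEnabled(true);
        }
    }
}
```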
  • In an optional solution, when real-time digital mixing is required, enable information can be sent to the digital mixing module by touching a function icon on the display screen 430 associated with the digital mixing interface; this activates the digital mixing module, which mixes the real-time audio signal from the audio input device with the at least one frame of audio data called by the player to obtain the mixed audio signal.
  • Further, as shown in FIG. 4, after the mixed audio signal is obtained, it can be transmitted directly to the speaker 470 coupled to the audio codec for real-time playback; optionally, it can also be passed through the upward data path in the system software to the digital mixing interface at the application layer for subsequent operations by upper-layer applications. Optionally, the subsequent operations may include at least one of the following: saving the obtained mixed audio signal in the local memory 420, uploading it to the network, or transmitting it to a third-party media player for playback; for example, the third-party media player may be a webcast platform, a music player, a video player, or the like.
  • the real-time digital mixing method may include the step 509 of acquiring a video image signal.
  • In an optional solution, the video image signal is acquired while the audio input device acquires the real-time audio signal; specifically, the video image signal may be acquired by a camera or another device with an image capture function, and consists of consecutive multi-frame image signals. In an optional solution, the video image signal may be obtained from local memory or from the network, and may be a video signal formed by multiple temporally or spatially consecutive pictures or by several non-consecutive pictures.
  • the real-time digital mixing method may include a step 510 of mixing the video image signal with the mixed audio signal to obtain a mixed video signal.
  • Optionally, the mixed audio signal obtained in step 508 and the video image signal obtained in step 509 may be transmitted to a video processing module for fusion to obtain the mixed video signal; optionally, the video processing module may be the video processing module 670 of the device 600. The video processing module 670 may be a software module stored in the memory, may be implemented by a hardware logic circuit, or may be a separate chip; for example, the video processing module may be a video codec. In an optional solution, the video processing module may be a software module or hardware circuit in the media processing system 131 of the device 100, or the software module for implementing video mixing stored in the memory 420 in FIG. 4.
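  • As an illustration only: the fusion of step 510 could be sketched with the platform's MediaMuxer, assuming the video images and the mixed audio signal have already been encoded upstream (for example H.264 and AAC via MediaCodec). The SampleSource interface is hypothetical, and a real implementation would interleave samples by timestamp rather than writing one track after the other.

```java
import java.nio.ByteBuffer;
import android.media.MediaCodec;
import android.media.MediaFormat;
import android.media.MediaMuxer;

// Sketch: combine encoded video and encoded mixed audio into one container.
public final class AvFusionSketch {

    public interface SampleSource { ByteBuffer next(MediaCodec.BufferInfo info); }

    public static void fuse(String outPath, MediaFormat videoFmt, MediaFormat audioFmt,
                            SampleSource video, SampleSource audio) throws Exception {
        MediaMuxer muxer = new MediaMuxer(outPath,
                MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
        int vTrack = muxer.addTrack(videoFmt);
        int aTrack = muxer.addTrack(audioFmt);
        muxer.start();

        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        ByteBuffer buf;
        while ((buf = video.next(info)) != null) muxer.writeSampleData(vTrack, buf, info);
        while ((buf = audio.next(info)) != null) muxer.writeSampleData(aTrack, buf, info);

        muxer.stop();
        muxer.release();
    }
}
```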
  • the mixed video signal may be transmitted to the video player for playing, or may be shared to multiple Internet users in real time through the network, or the mixed video signal may be stored in the local memory for subsequent playback by the user.
  • It should be understood that the apparatus embodiments provided in the present application are merely schematic; the unit division in FIG. 6 is only a logical functional division, and other division manners are possible in actual implementation. For example, multiple modules may be combined or integrated into another system. The coupling of the modules to one another may be implemented through interfaces, which are typically electrical communication interfaces, though mechanical interfaces or other forms of interface are not excluded. Therefore, the modules described as separate components may or may not be physically separate, and may be located in one place or distributed to different positions on the same or different devices.
  • The embodiment of the present application further provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to perform one or more steps of any of the above real-time digital mixing methods. If the component modules of the above apparatus are implemented in the form of software functional units and sold or used as independent products, they may be stored in the computer-readable storage medium.
  • Based on this understanding, the embodiment of the present application further provides a computer program product containing instructions. The part of the technical solutions of the present application that contributes in essence, or all or part of the technical solutions, may be embodied in the form of a software product: the computer software product is stored in a storage medium and includes instructions for causing a computer device, a mobile terminal, or a processor therein to perform all or part of the steps of the methods described in the embodiments of the present application. For the kinds of storage medium, refer to the related description of the memory 140.
  • The sound processing module 440 can be implemented in software; in that case, the sound processing module 440 may be an arithmetic unit formed by software running in the application processor 410. In other words, the application processor 410 implements the related method procedures of the embodiments by running software instructions.
  • In this embodiment, the at least one frame of audio data acquired through the media play interface is mixed in real time with the real-time audio signal originating from the audio input device, and the media play interface acquires the audio file frame by frame instead of acquiring the entire audio file. Moreover, the at least one frame of audio data is transmitted through the data path in the system software, so no external environmental noise is introduced. Further, since the mixing is performed in real time and reuses the existing playback process, calling at least one frame of audio data of the application-layer software through the media play interface is simple and flexible.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

An audio signal processing method and a terminal. The method mixes at least one frame of audio data acquired through a media play interface (610) in real time with a real-time audio signal originating from an audio input device, and the media play interface (610) acquires the audio file frame by frame rather than acquiring the entire audio file; during mixing, any frame of audio data of another audio file can be switched to at any time through the media play interface (610), and the at least one frame of audio data is transmitted through a data path in system software, so no external environmental noise is introduced. The method also introduces an audio detection mechanism that can adaptively adjust the volume of the at least one frame of audio data participating in the mixing, improving the mixing experience.

Description

Method and apparatus for real-time digital audio signal mixing — Technical field
This application relates to the communications field, and in particular to an audio signal processing method and apparatus.
Background
With the continuous development of mobile communication services and Internet services, more and more users tend to record or share moments of their lives by means of short voice clips, short videos, and live webcast; live webcast in particular is widely popular among Internet users because it is intuitive, fast, and highly interactive.
In some audio recording scenarios, to obtain an audio or video file with a specific effect, background music specified by the user needs to be mixed into the captured voice file through a mixing technique, or a specific accompaniment or song needs to be inserted into the recorded video during video recording or live webcast. In existing mixing techniques, whether analog or digital, the background audio file participating in the mixing cannot be adjusted or modified during the mixing process, so flexibility is poor.
Summary
Embodiments of this application provide an audio signal processing method and apparatus to implement flexible real-time digital mixing.
A first aspect of this application provides an audio signal processing method, including: acquiring at least one frame of audio data through a media play interface, where the media play interface is an application programming interface; transmitting the at least one frame of audio data to a digital mixing module through a data path in system software; acquiring a real-time audio signal; and mixing the at least one frame of audio data with the real-time audio signal in the digital mixing module to obtain a mixed audio signal.
In this audio signal processing method, the at least one frame of audio data acquired through the media play interface is mixed in real time with a real-time audio signal originating from an audio input device, and the media play interface acquires the audio file frame by frame rather than acquiring the entire file; during mixing, any frame of audio data of another audio file can be switched to at any time through the media play interface, implementing real-time digital mixing. Moreover, the at least one frame of audio data is transmitted through a data path in the system software, so no external environmental noise is introduced.
In a possible design, before the at least one frame of audio data is transmitted to the digital mixing module through the data path in the system software, the method further includes: decoding the at least one frame of audio data from a compressed form into a raw data form.
Optionally, the media play interface may be coupled to a music player. When the music player is enabled, the decoding of the at least one frame of audio data is completed by borrowing the music player's playback process, so that any frame of audio data arriving at the digital mixing module is already in raw data form rather than compressed; the digital mixing module therefore no longer needs to decode the audio data, which improves the efficiency of real-time mixing.
In a possible design, the system software includes operating system software. Optionally, the system software may further include driver software or platform software other than application software, such as open-source system software, middleware, or widgets.
In a possible design, the data path includes at least one of the following: an audio track source node, an audio controller, an audio hardware abstraction layer, or a hardware driver layer.
Optionally, the data path is the transmission path of audio data inside the operating system; the audio track source node is the starting point from which multiple audio data streams flow to multiple different audio tracks; the audio controller is the regulator of audio data processing and audio hardware device management in the system software; the audio hardware abstraction layer is the software interface obtained by abstracting the audio hardware devices; and the hardware driver layer is the driver of the audio hardware devices.
In a possible design, playback of the at least one frame of audio data is prohibited while the at least one frame of audio data is transmitted to the digital mixing module through the data path in the system software.
In this method, the playback process used by the at least one frame of audio data differs from an ordinary playback process: after being output from the media player interface, the at least one frame of audio data is not played out loud but is sent directly to the digital mixing module, which both completes the decoding of the at least one frame of audio data by means of the existing playback process and avoids introducing unnecessary background noise through external playback.
In a possible design, prohibiting playback of the at least one frame of audio data includes at least one of the following: turning off the audio output data stream of the at least one frame of audio data through the audio controller in the data path; or, based on the audio hardware abstraction layer in the data path, controlling the hardware driver layer in the data path to disable the audio output device for the at least one frame of audio data.
In a possible design, before the real-time audio signal is acquired, the method further includes: detecting whether the real-time audio signal is being input; and reducing the volume of the at least one frame of audio data when input of the real-time audio signal is detected.
In a possible design, before the real-time audio signal is acquired, the method further includes: acquiring a real-time analog audio signal and converting the real-time analog audio signal into the real-time audio signal.
In the above designs, the volume of the at least one frame of audio data participating in the mixing is adaptively adjusted based on the presence or absence of the real-time audio signal, so the real-time audio signal can be made more prominent.
In a possible design, reducing the volume of the at least one frame of audio data includes at least one of the following: reducing the volume of the at least one frame of audio data by controlling the audio controller in the data path; or reducing the volume of the at least one frame of audio data by controlling the digital mixing module.
In a possible design, before the at least one frame of audio data is mixed with the real-time audio signal in the digital mixing module, the method further includes: performing at least one of the following processes on the real-time audio signal: eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control.
Performing the above processing on the real-time audio signal before mixing improves the sound quality of the acquired real-time audio signal, reduces the noise it contains, avoids introducing unnecessary interference into the mixing process, and also prevents audio overflow during mixing, avoiding distortion of the mixed audio.
In a possible design, after the mixed audio signal is obtained, the method further includes: acquiring a video image signal; and mixing the video image signal with the mixed audio signal to obtain a mixed video signal.
In a possible design, after the mixed audio signal is obtained, the method further includes at least one of the following: playing the mixed audio signal, storing the mixed audio signal locally, sending the mixed audio signal to another apparatus, or uploading the mixed audio signal to a network.
After the mixed audio signal is obtained, it can be played in real time, stored for subsequent playback, or shared in real time with other terminal users or Internet users.
In a possible design, the method further includes: activating the digital mixing module through a digital mixing interface, where the digital mixing interface is an application programming interface.
A second aspect of this application provides an audio signal processing apparatus, including: a media play interface, a data path located in system software, a real-time signal acquiring module, and a digital mixing module. The media play interface is configured to acquire at least one frame of audio data and is an application programming interface; the data path is configured to transmit the at least one frame of audio data to the digital mixing module; the real-time signal acquiring module is configured to acquire a real-time audio signal; and the digital mixing module is configured to mix the at least one frame of audio data with the real-time audio signal to obtain a mixed audio signal.
In a possible design, the apparatus further includes: a decoding module, configured to decode the at least one frame of audio data from a compressed form into a raw data form before the data path transmits the at least one frame of audio data to the digital mixing module.
In a possible design, the system software includes operating system software. Optionally, the system software may further include driver software or platform software other than application software, such as open-source system software, middleware, or widgets.
In a possible design, the data path includes at least one of the following: an audio track source node, an audio controller, an audio hardware abstraction layer, or a hardware driver layer.
In a possible design, the data path is further configured to prohibit playback of the at least one frame of audio data while transmitting the at least one frame of audio data to the digital mixing module.
In a possible design, the audio controller in the data path is configured to turn off the audio output data stream of the at least one frame of audio data; or the audio hardware abstraction layer in the data path is configured to control the hardware driver layer in the data path to disable the audio output device for the at least one frame of audio data.
In a possible design, the apparatus further includes an audio detection module configured to: detect whether a real-time audio signal is being input before the real-time signal acquiring module acquires the real-time audio signal; and, when input of the real-time audio signal is detected, control reduction of the volume of the at least one frame of audio data.
In a possible design, the audio detection module is configured to perform at least one of the following: sending the control signal to the audio controller in the data path to reduce the volume of the at least one frame of audio data, for example sending a volume-reduction control signal to the audio controller; or sending the control signal to the digital mixing module to reduce the volume of the at least one frame of audio data, for example sending a volume-reduction control signal to the digital mixing module.
In a possible design, the apparatus further includes a pre-processing module configured to perform at least one of the following processes on the real-time audio signal: eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control.
In a possible design, the apparatus further includes a video processing module configured to: acquire a video image signal; and mix the video image signal with the mixed audio signal to obtain a mixed video signal.
In a possible design, the apparatus further includes at least one of the following modules: a playing module configured to play the mixed audio signal; a storage module configured to store the mixed audio signal; a sending module configured to send the mixed audio signal to another apparatus; or an uploading module configured to upload the mixed audio signal to a network.
In a possible design, the apparatus further includes: a digital mixing interface, configured to receive enable information and forward the enable information to activate the digital mixing module, where the digital mixing interface is an application programming interface.
A third aspect of this application provides an audio signal processing apparatus, including a processor and an audio processor. The processor is configured to read software instructions stored in a memory and execute the software instructions to perform the following operations: acquiring at least one frame of audio data through a media play interface, where the media play interface is an application programming interface; and transmitting the at least one frame of audio data to the audio processor through a data path in system software. The audio processor is configured to: acquire a real-time audio signal; and mix the at least one frame of audio data with the real-time audio signal to obtain a mixed audio signal.
In a possible design, the apparatus further includes the memory.
In a possible design, the apparatus further includes: a decoder, configured to decode the at least one frame of audio data from a compressed form into a raw data form before the data path transmits the at least one frame of audio data to the audio processor.
In a possible design, the system software includes operating system software. Optionally, the system software may further include driver software or platform software other than application software, such as open-source system software, middleware, or widgets.
In a possible design, the data path includes at least one of the following: an audio track source node, an audio controller, an audio hardware abstraction layer, or a hardware driver layer.
In a possible design, the processor is configured to execute the software instructions to further perform the following operation: prohibiting playback of the at least one frame of audio data while the at least one frame of audio data is transmitted to the digital mixing module.
In a possible design, the processor is configured to execute the software instructions to further perform the following operations: turning off the audio output data stream of the at least one frame of audio data through the audio controller; or controlling, through the audio hardware abstraction layer, the hardware driver layer in the data path to disable the audio output device for the at least one frame of audio data.
In a possible design, the audio processor is further configured to: detect whether a real-time audio signal is being input before acquiring the real-time audio signal; and, when input of the real-time audio signal is detected, control reduction of the volume of the at least one frame of audio data.
In a possible design, the audio processor is specifically configured to: send the control signal to the audio controller in the data path to reduce the volume of the at least one frame of audio data, for example sending a volume-reduction control signal to the audio controller; or send the control signal to the digital mixing module in the audio processor to reduce the volume of the at least one frame of audio data.
In a possible design, the audio processor is further configured to perform at least one of the following processes on the real-time audio signal: eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control.
In a possible design, the processor is configured to execute the software instructions to further perform the following operations: acquiring a video image signal; and mixing the video image signal with the mixed audio signal to obtain a mixed video signal.
In a possible design, the processor is configured to execute the software instructions to further perform the following operations: playing the mixed audio signal, storing the mixed audio signal in the memory, sending the mixed audio signal to another apparatus through a transmission interface, or uploading the mixed audio signal to a network through a network interface. The network interface may be a wireless transceiver, a radio frequency (RF) circuit, a wired interface, or the like; the transmission interface may be an input/output interface.
A fourth aspect of this application provides a computer-readable storage medium storing instructions that, when run on a computer or processor, cause the computer or processor to perform the method of the first aspect or any possible design thereof.
A fifth aspect of this application provides a computer program product containing instructions that, when run on a computer or processor, cause the computer or processor to perform the method of the first aspect or any possible design thereof.
A sixth aspect of this application provides an apparatus including a processor, configured to read software instructions in a memory and execute the software instructions to perform the method of the first aspect or any possible design thereof.
Optionally, the apparatus further includes the memory, configured to store the software instructions.
Brief description of the drawings
FIG. 1 is a schematic structural diagram of an apparatus according to an embodiment of this application;
FIG. 2 is a schematic structural diagram of a peripheral device according to an embodiment of this application;
FIG. 3 is a schematic diagram of an application scenario of a mixing technique according to an embodiment of this application;
FIG. 4 is a structural block diagram, including a specific system software architecture and corresponding hardware components, according to an embodiment of this application;
FIG. 5 is a flowchart of a real-time digital mixing method according to an embodiment of this application;
FIG. 6 is a schematic structural diagram of another apparatus according to an embodiment of this application.
Detailed description
The terms "first", "second", and the like in the specification, claims, and above drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. Moreover, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion, for example inclusion of a series of steps or units; a method, system, product, or device is not necessarily limited to the steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such process, method, product, or device.
FIG. 1 is a schematic diagram of an apparatus 100 according to an embodiment of this application. The apparatus 100 may include an antenna system 110, which may be one or more antennas or an antenna array composed of multiple antennas. The apparatus 100 may further include a radio frequency (RF) circuit 120, which may include one or more analog RF transceivers and/or one or more digital RF transceivers, and which is coupled to the antenna system 110. It should be understood that, in the embodiments of this application, coupling refers to interconnection in a specific manner, including direct connection or indirect connection through other devices, for example through various interfaces, transmission lines, or buses. The RF circuit 120 may be used for various types of cellular wireless communication.
The apparatus 100 may further include a processing system 130, which may include a communication processor used to control the RF circuit 120 to receive and send signals through the antenna system 110; the signals may be voice signals, media signals, or control signals. The communication processor in the processing system 130 may also be used to manage these signals; specifically, signal management here may include signal enhancement, signal filtering, encoding/decoding, signal modulation, signal mixing, signal separation, or other known signal processing procedures, as well as new signal processing procedures that may emerge in the future. The processing system 130 may include various general-purpose processing devices, for example a general-purpose central processing unit (CPU), a system on chip (SOC), a processor integrated on an SOC, a separate processor chip, or a controller; it may also include dedicated processing devices, for example an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a digital signal processor (DSP). The processing system 130 may be a group of multiple processors coupled to one another through one or more buses. The processing system may include an analog-to-digital converter (ADC) and a digital-to-analog converter (DAC) to connect signals between different parts of the apparatus; for example, an analog voice signal captured by a microphone may be converted into a digital voice signal through the ADC and sent to a digital signal processor for processing, or a digital voice signal in the processor may be converted into an analog voice signal through the DAC and played through a speaker. The processing system may include a media processing system 131 for processing media signals such as images, audio, and video; specifically, the media processing system 131 may include a sound processing system 132, which may be a general-purpose or dedicated sound processing device, for example an audio processing subsystem integrated on an SOC or a sound processing module integrated on a processor chip. Optionally, the sound processing module may be a software module or a hardware module, or an independently existing sound processing chip. The sound processing system 132 is used to implement audio-signal-related processing.
The apparatus 100 may further include a memory 140 coupled to the processing system 130; specifically, the memory 140 may be coupled to the processing system 130 through one or more memory controllers. The memory 140 may be used to store computer program instructions, including a computer operating system (OS) and various user applications, for example an audio processing program with a mixing function, an application with a live-streaming function, a video player, a music player, and other possible applications; the memory 140 may also be used to store user data such as calendar information, contact information, captured image information, audio information, or other media files. The processing system 130 may read computer program instructions or user data from the memory 140, or store computer program instructions or user data into the memory 140, to implement related processing functions. For example, the processing system may read an audio file stored in the memory 140 and invoke a music player to play it, or read an audio file in the memory 140 into the processor for a series of operations such as decoding, mixing, and encoding. The memory 140 may be a non-volatile memory, for example an EMMC (Embedded MultiMedia Card), UFS (Universal Flash Storage), or read-only memory (ROM), or another type of static storage device capable of storing static information and instructions; it may also be a volatile memory, for example a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions; it may also be, but is not limited to, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compressed discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other computer-readable storage medium that can be used to carry or store program code in the form of instructions or data structures and that can be accessed by a computer. The memory 140 may exist independently, or may be integrated with the processing system.
The apparatus 100 may further include a wireless transceiver 150 that can provide wireless connection capability to other devices, which may be peripheral devices such as a wireless headset, Bluetooth headset, wireless mouse, or wireless keyboard, or a wireless network, for example a wireless fidelity (WiFi) network, wireless personal area network (WPAN), or other wireless local area network (WLAN). The wireless transceiver 150 may be a Bluetooth-compatible transceiver used to wirelessly couple the processing system 130 to peripheral devices such as a Bluetooth headset or wireless mouse, or a WiFi-compatible transceiver used to wirelessly couple the processing system 130 to a wireless network or other devices.
The apparatus 100 may further include an audio circuit 160 coupled to the processing system 130. The audio circuit 160 may include a microphone 161 and a speaker 162. The microphone 161 may receive sound input from the outside, which may be user voice input, externally played music input, noise input, or other forms of external sound; the microphone 161 may be a built-in microphone integrated in the apparatus 100 or an external microphone coupled to the apparatus through an interface, for example a headset microphone coupled through a headset jack. The speaker 162 can play audio data, which may come from the microphone, or be a music file stored in the memory or an audio file processed by the processing system; the speaker is one kind of audio transducer that can enhance the audio signal, and it may also be replaced by another form of audio transducer. It should be understood that the apparatus 100 may have one or more microphones and one or more earphones; the embodiments of this application do not limit their number. The processing system 130 drives or controls the audio circuit through an audio controller (not shown in FIG. 1); specifically, it enables or disables at least one of the microphone or the speaker according to instructions of the processing system 130. When an audio signal from the microphone needs to be received, the processing system enables the microphone through a related control instruction and receives the audio signal input by the microphone; the audio signal may be processed in the processing system 130, sent to the memory 140 for storage, played by the speaker, transmitted to a network or other apparatuses through the RF circuit 120 via the antenna system 110, or transmitted to a network or other apparatuses through the wireless transceiver 150. When an audio file needs to be played, the processing system enables the speaker through a related control instruction to play the audio signal. Correspondingly, when the microphone and speaker are not needed, they are disabled through related control instructions.
The apparatus 100 may further include a display screen 170 for displaying information input by the user and various menus of information provided to the user; these menus are associated with specific internal modules or functions. The display screen 170 may also accept user input, for example control information such as enable or disable. Specifically, the display screen 170 may include a display panel 171 and a touch panel 172. The display panel 171 may be configured using a liquid crystal display (LCD), organic light-emitting diode (OLED), light-emitting diode (LED) display device, or cathode ray tube (CRT). The touch panel 172, also called a touch screen or touch-sensitive screen, can collect contact or non-contact operations of the user on or near it (for example, operations performed by the user on or near the touch panel 172 with a finger, stylus, or any other suitable object or accessory, which may also include somatosensory operations; the operations include single-point control operations, multi-point control operations, and other operation types) and drive the corresponding connection apparatus according to a preset program. Optionally, the touch panel 172 may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the signal brought by the user's touch operation and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection apparatus, converts it into information that the processing system 130 can process, sends it to the processing system 130, and can receive and execute commands sent by the processing system 130. Further, the touch panel 172 may cover the display panel 171; the user may operate on or near the touch panel 172 covering the display panel 171 according to the content displayed by the display panel 171 (the displayed content includes, but is not limited to, a soft keyboard, virtual mouse, virtual keys, icons, and so on). After detecting an operation on or near it, the touch panel 172 transmits the operation to the processing system 130 through the I/O subsystem 10 to determine the user input, and the processing system 130 then provides corresponding visual output on the display panel 171 through the I/O subsystem 10 according to the user input. Although in FIG. 1 the touch panel 172 and the display panel 171 are two independent components implementing the input and output functions of the apparatus 100, in some embodiments they may be integrated to implement the input and output functions of the apparatus 100. The following explanation uses a digital mixing operation as an example: the user touches an enable button on the display screen associated with the digital mixing module; the touch detection apparatus detects the enable signal brought by this touch operation and transmits it to the touch controller; the touch controller converts the enable information into information the processor can process and transmits it through the display controller 13 in the I/O subsystem 10 to the processor in the processing system; after receiving the enable information, the processor activates the digital mixing module, completes the digital mixing operation, sends the resulting mixed audio to the player for playback, and displays information related to the mixed-audio playback on the display screen through the display controller 13 in the I/O subsystem 10; the related information may include playback time, real-time lyrics, and so on.
The apparatus 100 may further include one or more sensors 180 coupled to the processing system 130. The sensors 180 may include an image sensor, motion sensor, proximity sensor, ambient noise sensor, sound sensor, accelerometer, temperature sensor, gyroscope, or other types of sensors, and various combinations thereof. The processing system 130 drives the sensors 180 through the sensor controller 12 in the I/O subsystem 10 to receive various information such as audio signals, image signals, and motion information; the sensors 180 transmit the received information to the processing system 130 for processing.
The apparatus 100 may further include other input devices 190 coupled to the processing system 130 to receive various user inputs, for example input numbers, names, addresses, and media selections; the media may be music or other audio files, video files in various formats, still pictures, motion pictures, and so on. The other input devices 190 may include a keyboard, physical buttons (press buttons, rocker buttons, etc.), a dial pad, slide switches, joysticks, click wheels, and an optical mouse (an optical mouse is a touch-sensitive surface that does not display visual output, or an extension of the touch-sensitive surface formed by a touch screen).
The apparatus 100 may further include the above-mentioned I/O subsystem 10, which may include an other-input-device controller 11 for receiving signals from the other input devices 190 or sending control or drive information of the processing system 130 to them; the I/O subsystem 10 may further include the above-mentioned sensor controller 12 and display controller 13, which respectively implement the exchange of data and control information between the sensors 180 and the processing system 130 and between the display screen 170 and the processing system 130.
The apparatus 100 may further include a power supply 101 to supply power to the other components 110-190 of the apparatus 100; the power supply may be a rechargeable or non-rechargeable lithium-ion or nickel-metal-hydride battery. Further, when the power supply 101 is a rechargeable battery, it may be coupled to the processing system 130 through a power management system to implement functions such as charge and discharge management and power consumption adjustment.
Although not shown, the apparatus 100 may further include a camera for acquiring a single-frame image or consecutive multi-frame video images according to the working mode of the apparatus 100, and for transmitting the acquired image information to the processing system 130 for processing. Specifically, an image processing unit may be integrated into the processing system 130, or the processing system may include a separate image processor or image processing chip. Optionally, the processing system 130 may further include a Video Codec module that can fuse image signals and audio signals into video signals: when the apparatus works in video shooting mode, the consecutive multi-frame video images acquired by the camera and the audio signal acquired by the audio circuit 160 are fused in the Video Codec to obtain a video signal with sound; when a mixed video needs to be recorded, the consecutive multi-frame video images acquired by the camera and the mixed audio signal obtained by the digital mixing module are fused in the Video Codec to obtain a mixed video signal. Optionally, the Video Codec may be a processing module integrated in the processing system 130 or an independently existing video codec chip.
It should be understood that the apparatus 100 in FIG. 1 is merely an example and does not limit the specific form of the apparatus 100; the apparatus 100 may further include existing components not shown in FIG. 1 or other components that may be added in the future.
In an optional solution, the RF circuit 120, the processing system 130, and the memory 140 may be partly or wholly integrated on one chip, or may be three chips independent of one another. The RF circuit 120, the processing system 130, and the memory 140 may include one or more integrated circuits arranged on a printed circuit board (PCB).
FIG. 2 is an example of a peripheral device 200 according to an embodiment of this application; the peripheral device 200 exchanges data and interacts with the apparatus 100. The peripheral device may include a processor 210, a microphone 220, an audio transducer 230, a wireless transceiver 240, and one or more sensors 250.
The processor 210 is coupled to the microphone 220 and the audio transducer 230; the processor 210 may drive or disable the microphone 220 and the audio transducer 230, and may accept audio data from the microphone 220 or transmit audio data to the audio transducer 230. Specifically, the processor 210 may be coupled to an audio codec to implement signal exchange with the microphone 220 and the audio transducer 230; specifically, the audio transducer 230 may be a speaker.
The wireless transceiver 240 is also coupled to the processor 210; for the wireless transceiver 240, refer to the description of the wireless transceiver 150 in FIG. 1. The peripheral device 200 is wirelessly connected to the apparatus 100 or other devices with similar functions through the wireless transceiver 240; for example, under the control of the processor 210, the peripheral device 200 may transmit signals to or receive signals from the apparatus 100 through the wireless transceiver 240.
The processor 210 is coupled to one or more sensors 250, which may be used to detect data information such as user activity and various environmental information; the sensors 250 are partly or wholly the same as the sensors 180 in the apparatus 100 and transmit the detected data to the processor 210 for processing.
The peripheral device 200 shown in FIG. 2 may be a wireless headset that exchanges data with the apparatus 100 through a Bluetooth interface. It should be understood that although FIG. 2 shows a peripheral device of a specific form, this does not limit the form of the peripheral device; in some optional solutions, the peripheral device 200 may also be a wired headset, wired keyboard, wireless keyboard, wired or wireless cursor control device, or other wired or wireless input or output device.
The system formed by the apparatus 100 and the peripheral device 200 may be any of multiple types of data processing systems, for example an audio data processing system, image data processing system, temperature data processing system, or other data processing system.
FIG. 3 is a schematic diagram of an application scenario of a mixing technique according to an embodiment of this application, in which 310 is a terminal containing a mixing module. The terminal 310 may be a specific form of the apparatus 100 shown in FIG. 1 and contains all or part of the structure of the apparatus 100; 311 is a microphone, which may be the terminal 310's own microphone or an independent microphone acting as a peripheral device 200 that exchanges data with the terminal 310 through wired or wireless communication; for example, it may be the microphone of a wireless headset, the microphone on a wired headset, or a sound-sensitive device or external sound card with a sound collection function.
In this application scenario, the user can perform scene recording, blending the voice content they share with specific background music to make the shared content more engaging. Optionally, the shared voice content may be song singing, talk shows, voice lectures, or other forms of voice sharing. The terminal 310 mixes the user voice captured by the microphone with preset background music to obtain mixed audio. Optionally, the user may play the resulting mixed audio locally in real time through the speaker 312; optionally, the user may also upload the mixed audio data to the network 320 based on wireless communication technology and share it with many Internet users through the network. Internet users can obtain the mixed audio from the network 320 through various devices with wireless communication capability: as shown in FIG. 3, listener 1 obtains the downlink mixed audio data through a smartphone 330; in this example, listener 2 obtains it through a desktop computer 340 connected to the network; and listener 3 obtains it through a laptop 350. It should be understood that a larger number of listeners can simultaneously obtain the mixed audio data the user shares to the network; FIG. 3 shows only three listeners as an example. It should also be understood that the devices used by Internet users may be audio players, in-vehicle devices, wearable devices, tablets, Bluetooth headsets, radios, and other devices with wireless communication capability. Optionally, after obtaining the mixed audio data, the user may store it in a storage device of the local terminal, for example the memory 140 of the apparatus 100, for playback or sharing when needed.
Optionally, in this application scenario, the user can perform live webcast; the live content may be dance performances, handicraft displays and explanations, online etiquette lessons, and other content that can be presented by video, and the user can mix in different background music according to the style of the video content to make it more interesting and entertaining. The terminal 310 acquires the user's image information through a camera (not shown in FIG. 3) while acquiring the user's voice information through the microphone 311; the user voice captured by the microphone is mixed with the background music specified by the user in the mixing module of the terminal 310 to obtain mixed audio, and further, the image information acquired by the camera is fused with the resulting mixed audio information in the Video Codec module of the terminal 310 (not shown in FIG. 3) to obtain mixed video data. The user uploads the resulting mixed video data to the network 320 based on wireless communication technology, and Internet users can obtain and watch the mixed video from the network 320 through various devices with wireless communication and video playback capability. Optionally, the user can watch the effect of the mixed video on the local terminal 310; optionally, the user can save the recorded mixed video in the storage device of the terminal 310 for subsequent playback and sharing.
It should be understood that the wireless communication technology used in the above application scenarios may be any technology that can provide cellular wireless communication services such as voice calls, video, data, or broadcast, for example Global System for Mobile Communication (GSM), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), future 5th Generation (5G) mobile communication technology, or future evolved Public Land Mobile Network (PLMN) technology; the embodiments of this application do not limit the type of cellular wireless communication technology.
It should be understood that the above scenarios do not limit the method of obtaining the background music specified by the user. Optionally, the specified background music may be played out into the environment and enter the mixing module of the terminal 310 through the microphone just like the user voice; optionally, it may be read directly from the terminal's local memory without external playback; optionally, buffered background music may also be obtained from the network.
FIG. 4 is a structural block diagram, including a specific system software architecture and corresponding hardware components, according to an embodiment of this application; this system software architecture and the corresponding hardware components can implement the real-time digital mixing method. As shown in the figure, component 410 is the application processor (AP) in the processing system 130, or an application chip. The application processor is used to run system software; for example, it runs operating system software, which may be at least one of the Android system, the iOS system, or the Linux system. The system software may also include driver software or platform software other than application software (also called applications), such as open-source system software, middleware, or widgets. Optionally, the application processor may also run application-related code; the application may be a webcast platform, audio player, video player, beauty camera, or an application with communication capability. Optionally, the application processor supports application extension; optionally, the application processor may run user-interface-related code. The memory 420 may be the memory 140 and may store the operating system software, application software, user-interface-related software, or other computer program code that can be run by the application processor; the memory may also store local audio files, local video files, the user's phone book, or other user data. The touch screen 430 may be the display screen 170 and may display the apparatus's various menus and function icons, which are associated with specific internal modules or functions; for example, the touch screen has icons associated with an audio player APP, a video player APP, a beauty camera APP, a webcast platform APP, or other applications, and the user runs the corresponding module by touching the relevant icon on the touch screen. The sound processing module 440 may implement audio-related processing; for example, it may be an independent hardware processor implementing real-time mixing of multiple audio signals, and specifically it may be a High Fidelity (HiFi) device. The sound processing module 440 may be an independent sound processing chip, a processing module integrated in the application processor, or a processing module integrated in a processor chip other than the application processor; alternatively, the sound processing module 440 may be implemented in software run by the application processor — this embodiment does not limit this. For example, when the sound processing module 440 is implemented in software, it is, like the application layer or application framework layer formed by the application processor, a processing unit formed by the application processor running the instructions of driver software. Optionally, the sound processing module 440 contains a digital mixing module 441 that implements the real-time digital mixing algorithm; optionally, the sound processing module 440 also contains a pre-processing module 442 that performs acoustic processing on audio signals; optionally, the processing the pre-processing module 442 can perform includes eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, gain control, or other acoustic processing algorithms. The audio codec 450 implements at least one of analog-to-digital or digital-to-analog conversion of audio signals; for example, the audio codec 450 may convert the audio data processed by the sound processing module into an analog signal and play it through the speaker 470, a wired headset, or a Bluetooth headset (not shown in FIG. 4); optionally, the audio codec 450 may convert an analog audio signal captured by the microphone 460 or another audio input device into a digital audio signal and pass it to the sound processing module or another processor for various audio processing. The audio codec 450 may be a separate audio codec chip or an audio codec module integrated in a processor chip or a HiFi chip.
Optionally, the application processor 410, the sound processing module 440, and the audio codec 450 may together form the sound processing system 132 in FIG. 1, implementing various forms of audio signal processing.
As mentioned above, the application processor 410 may run operating system software. FIG. 4 shows a specific application framework of the operating system software for the audio/video architecture, for example the Android application framework. As shown in FIG. 4, the application framework includes:
The application (APP) layer, located at the top of the entire audio/video software architecture. Optionally, this layer is implemented based on a Java structure. The APP layer contains audio-related application programming interfaces (APIs), video-related APIs, or other kinds of APIs. An API is bound to a specific internal application APP; control parameters are sent through the API to invoke the corresponding APP or to receive the APP's return values.
The application framework (Framework) layer, the logical scheduling layer of the entire audio/video software architecture. It is the policy control center of the architecture and can schedule and allocate policies for the entire audio/video processing procedure. This layer also contains some API interfaces for processing audio/video data streams and controlling audio/video hardware devices; optionally, the core architecture of this layer is composed of at least one of Java, C++, or C.
The hardware abstraction layer (HAL), the interface layer between the audio/video operating system software and the audio/video hardware devices. It provides the interface for interaction between upper-layer software and lower-layer hardware. The HAL abstracts the underlying hardware into software containing corresponding hardware interfaces; the underlying hardware devices can be configured by accessing the HAL, for example, related hardware devices can be enabled or disabled at the HAL. Optionally, the core architecture of the HAL is composed of at least one of C++ or C.
The kernel (Kernel) layer, which contains the hardware driver layer used to directly control the underlying hardware devices according to control information input by the hardware abstraction layer, for example driving or disabling a hardware device. Optionally, the core architecture of the Kernel layer is composed of at least one of C++ or C.
Data and control information are exchanged between the application processor 410 and the sound processing module 440 through a relay communication layer; specifically, the relay communication layer may be a mailbox (MailBox) communication mechanism implementing the interaction between the system software or application software of the application processor 410 and the sound processing module 440. When the sound processing module 440 is formed by software instructions run by the application processor 410, the MailBox is the interface between the sound processing module 440 and the system software. When the sound processing module 440 is independent hardware or software outside the application processor 410, the MailBox may be an interface that includes hardware. In a typical case, the sound processing module 440 is independent hardware, such as a coprocessor, microprocessor, or logic circuit, used to perform functions different from those of the application processor.
FIG. 5 shows a real-time digital mixing method according to an embodiment of this application, and FIG. 6 is a logical block diagram of an apparatus implementing this method. The real-time digital mixing method in FIG. 5 is described below based on the operating system software architecture and corresponding hardware components shown in FIG. 4 and the apparatus shown in FIG. 6. For ease of understanding, the embodiment of this application describes the method in the form of steps; although FIG. 5 shows an order for the method, in some cases the described steps may be performed in an order different from that shown. It should be understood that the structural block diagram in FIG. 4 and the apparatus in FIG. 6 do not limit each other.
The real-time digital mixing method includes step 501: acquire at least one frame of audio data through a media play interface. The at least one frame of audio data may be audio data that already exists, including background sound, music, accompaniment, or the like.
The media play interface corresponds to the media play interface 610 of the apparatus 600 and is specifically an application programming interface (API) located in the application layer of the application processor 410 in FIG. 4. Optionally, the media play interface may specifically be an audio player API, a video player API, a webcast platform API, or an API of another application with audio/video playback functions. Taking the structure shown in FIG. 4 as an example, in an optional solution, when real-time mixing is required, an audio player function icon on the touch screen 430 is touched; the icon is associated with the audio play API, so that the audio player APP is invoked through the audio play interface to read at least one frame of audio data. Optionally, the at least one frame of audio data here may be at least one frame of a local audio file stored in a memory (for example, the memory 140 in the apparatus 100 or the memory 420 in the block diagram of FIG. 4); optionally, it may also be at least one frame of an audio file buffered or downloaded from the Internet.
In the above method, the audio file is acquired frame by frame by calling the media player, and is processed flexibly through the playback process; for example, any frame of audio data of another audio file can be switched to at any time through the media play interface, or the volume of the at least one frame of audio data can be adjusted during playback. The playback process here specifically includes: the entire flow of calling the audio file from the media player and finally playing it through a speaker or another audio output device.
Optionally, the real-time digital mixing method includes step 502: decode the at least one frame of audio data from a compressed form into a raw data form.
Decoding the at least one frame of audio data of the invoked audio file from the compressed form into the raw data form is completed within the playback process. Optionally, this step may be completed by the decoding module 650 of the apparatus 600; optionally, the decoding module 650 may be the audio codec 450 in FIG. 4; optionally, it may be an audio codec provided by the media player itself, implemented as a software module or in hardware; optionally, the decoding module 650 may be a separate audio codec chip or an audio codec module integrated in a processor chip or a HiFi chip. Optionally, the audio data in raw data form may be a Pulse Code Modulation (PCM) data stream. Optionally, the compressed audio data may be data compressed using Windows Media Audio (WMA), Adaptive Predictive Encoding (APE), Free Lossless Audio Codec (FLAC), Moving Picture Experts Group Audio Layer III (MP3), or other lossy or lossless compression techniques. The audio data in raw data form is the decoding result obtained by decoding the compressed audio data with the corresponding decoding technique.
The real-time digital mixing method includes step 503: transmit the at least one frame of audio data to the digital mixing module through a data path in the system software. Optionally, the data path may be the data path 620 of the apparatus 600. Optionally, the system software here may be operating system software, for example the Android operating system, a Linux operating system, the iOS system, or another type of operating system software. The data path is the transmission path of audio data inside the operating system and runs through the entire architecture of the operating system software; specifically, the data path may include at least one of the following: the audio track source node 621, the audio controller 622, the audio hardware abstraction layer 623, or the hardware driver layer 624.
After the at least one frame of audio data of the audio file is obtained, it is transmitted from the media play interface MediaPlayer at the application layer to the audio track source node at the application framework layer. Optionally, the at least one frame of audio data arriving at the audio track source node is audio data in raw data form, for example a PCM data stream. As shown in FIG. 4, the audio track source node is located at the Framework layer of the operating system software; specifically, the audio track source node is the AudioTrack interface in the Audio system. The AudioTrack interface is an API that the Audio system provides externally. AudioTrack is the source node of multiple audio tracks, or may be called the starting point of multiple audio tracks; audio data with different parameter characteristics converges at this AudioTrack interface. AudioTrack selects different audio tracks for audio data according to its parameter characteristics; an audio track is an audio standard with fixed parameter characteristics. AudioTrack enables the output of audio data on the operating system platform. Optionally, the parameter characteristics of audio data may include the sampling rate, bit depth, number of channels, type of audio stream, and the like.
The at least one frame of audio data flowing in from the audio track source node 621 reaches the audio controller 622, which, as shown in FIG. 4, is located at the Framework layer of the operating system software. Specifically, the audio controller is AudioFlinger, the working engine of the Audio system. AudioFlinger manages all input and output audio streams of the audio system and can control reads and writes of the underlying hardware devices. For example, AudioFlinger can adjust the volume of audio data, or the audio data output stream can be turned off through AudioFlinger so that the audio data is prohibited from reaching the underlying hardware device; optionally, the underlying hardware device may be the speaker 162 of the apparatus 100, the speaker 470 in FIG. 4, or another audio output device. In an optional solution, when the audio data following the playback process should not be played to the outside through an audio output device, the audio data output stream can be turned off through AudioFlinger to prohibit the audio data from reaching the underlying hardware device.
After passing through the audio controller 622, the at least one frame of audio data reaches the audio hardware abstraction layer 623, which, as shown in FIG. 4, is located at the HAL layer. Specifically, the audio hardware abstraction layer is the Audio HAL, a software abstraction of the underlying audio hardware devices; each underlying audio hardware device has a corresponding software interface at the Audio HAL layer, through which the corresponding audio hardware device can be controlled, for example enabled or disabled.
After passing through the audio hardware abstraction layer 623, the at least one frame of audio data reaches the hardware driver layer, Driver 624; the Driver is the direct executor of control actions. Control commands for the underlying hardware devices at the Audio HAL layer are all implemented by the Driver; for example, when "drive the speaker" is set at the Audio HAL layer, the Driver drives the speaker under that control command. In an optional solution, when the audio data following the playback process should not be played to the outside through an audio output device, "disable the speaker" can be set at the Audio HAL layer, and the Driver disables the speaker under that control command.
To reach the digital mixing module from the application layer of the operating system software architecture, the at least one frame of audio data also passes through a relay communication layer; as shown in FIG. 4, the relay communication layer may be a mailbox communication mechanism used to implement the exchange of data or control information between the system software or application software of the application processor and the sound processing module.
Optionally, the real-time digital mixing method may include steps 504-506. Step 504: detect whether a real-time audio signal is being input; when input of a real-time audio signal is detected, perform step 505; when no real-time audio signal input is detected, perform step 506.
When two audio signals are mixed, it is usually desirable to emphasize one of them. In the real-time digital mixing method provided by the embodiments of this application, emphasizing the real-time audio signal while attenuating the at least one frame of audio data acquired through the media play interface yields a better mixing experience.
In an optional solution, this step may be completed by the audio detection module 660 of the apparatus 600, which may be a software module or hardware circuit integrated in the processor or the sound processing module, or an independent chip. In an optional solution, an audio detection module may be added to the sound processing module 440 in FIG. 4; after the digital mixing module is activated through the digital mixing interface, whether the audio input device has real-time audio signal input can be detected based on the audio detection module. Optionally, the audio detection module may be a voice activity detection (VAD) module.
Step 505: reduce the volume of the at least one frame of audio data.
Step 506: increase the volume of the at least one frame of audio data.
Specifically, when the audio detection module detects that a real-time audio signal is being input, the audio detection module 660 sends a control signal to the audio controller 622 in the data path 620 to reduce or increase the volume of the at least one frame of audio data. Optionally, the audio controller 622 is AudioFlinger, which reduces or increases the volume of the at least one frame of audio data after receiving the control signal.
In an optional solution, the audio detection module 660 sends control information to the digital mixing module 640 to reduce or increase the volume of the at least one frame of audio data. Optionally, this can be done by changing a volume-related variable in the digital mixing module: for example, the volume of the at least one frame of audio data can be reduced by decreasing the volume-related variable and increased by increasing it. Optionally, the digital mixing module may also be the digital mixing module 441 in FIG. 4.
The real-time digital mixing method includes step 507: acquire a real-time audio signal. The real-time audio signal may be a digital audio signal obtained after processing sounds from humans or nature.
Optionally, the real-time audio signal may be acquired through the real-time signal acquiring module 630 of the apparatus 600. Optionally, the real-time signal acquiring module 630 may be an interface for receiving a real-time audio signal sent by another device; the real-time audio signal may be a digital signal that has or has not undergone at least one of the following processes: eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control. It should be understood that "real-time" here means that there is no time delay between the sound source producing the sound and the acquisition of that sound, or that any delay is small enough to be negligible.
Optionally, before the real-time audio signal is acquired, the real-time digital mixing method further includes: acquiring a real-time analog audio signal. Optionally, the apparatus 600 may include an audio input device and an analog-to-digital converter (ADC); the real-time analog audio signal may be acquired through the audio input device and further converted by the ADC to obtain the above real-time audio signal. Optionally, the audio input device may be the apparatus's own microphone, a sound-sensitive device, or another device with a sound collection function, for example the microphone 161 in FIG. 1, the microphone 220 in FIG. 2, or the microphone 460 in FIG. 4; optionally, the audio input device may be the peripheral device shown in FIG. 2, for example a wireless or wired headset; optionally, the audio input device may be a chip with a sound collection function, or an audio codec (Codec) connected to a microphone, sound-sensitive device, or other device with a sound collection function.
The real-time digital mixing method includes step 508: mix the at least one frame of audio data with the real-time audio signal in the digital mixing module to obtain a mixed audio signal. For example, the at least one frame of audio data may be background music, narration, accompaniment, or the like, while the real-time audio signal originates from human or natural sounds, thereby achieving the mix.
Optionally, the digital mixing module may be the digital mixing module 640 of the apparatus 600; optionally, it may be the digital mixing module 441 in the sound processing module 440 shown in FIG. 4; optionally, it may be a software module, for example a function; optionally, it may be implemented by hardware logic; optionally, it may be independent hardware, for example a coprocessor, a microprocessor, or another processor core. In an optional solution, when the digital mixing module is a software module, a new API corresponding to the digital mixing module may be added in the application layer of the audio/video software architecture, and interaction with the internal digital mixing module is implemented through this digital mixing interface. For example, the API may be a digital mixing interface through which control information can be sent to the digital mixing module (as shown by the downward dashed arrow in FIG. 4) to control it, or through which audio data can be received from the digital mixing module (as shown by the upward solid arrow from the digital mixing module to the digital mixing interface in FIG. 4); specifically, the audio data may be the mixed audio signal obtained by real-time mixing, and the control information may include digital-mixing-module enable information, disable information, or the like. Optionally, the audio source management interface shown in FIG. 4 may be AudioRecord, which manages the audio source and is responsible for collecting audio data on the operating system platform (for example, the Android platform), e.g. recording with the platform's audio input hardware.
In an optional solution, before the at least one frame of audio data is mixed with the real-time audio signal, the real-time audio signal may undergo at least one of the following processes: eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control. Optionally, this processing may be performed in the pre-processing module 680 of the apparatus 600. The pre-processing module 680 may be implemented as a software module or hardware logic, for example a software module in HiFi or a piece of hardware logic integrated in HiFi; optionally, the pre-processing module may be integrated in one chip with the digital mixing module 640, or may be an independent chip. Optionally, the pre-processing module may be the pre-processing module 442 in FIG. 4 or a pre-processing module in the sound processing system 132 of the apparatus 100.
In an optional solution, the above at least one process may be implemented in an audio codec; optionally, the audio codec may be the audio codec 450 in FIG. 4, an audio codec module integrated in the audio circuit 160 of the apparatus 100, or an audio codec located in the processing system 130. Further, before the real-time audio signal is acquired, an analog audio signal is acquired, and the processing may further include analog-to-digital conversion (ADC) to convert the analog audio signal into a digital real-time audio signal.
Performing the above processing on the real-time audio signal before mixing improves the sound quality of the acquired real-time audio signal, reduces the noise it contains, avoids introducing unnecessary interference into the mixing process, and also prevents audio overflow during mixing, avoiding distortion of the mixed audio.
In an optional solution, when real-time digital mixing is required, enable information can be sent to the digital mixing module by touching a function icon on the display screen 430 associated with the digital mixing interface; this activates the digital mixing module, which mixes the real-time audio signal from the audio input device with the at least one frame of audio data called by the player to obtain the mixed audio signal.
Further, as shown in FIG. 4, after the mixed audio signal is obtained, it can be transmitted directly to the speaker 470 coupled to the audio codec for real-time playback; optionally, it can also be transmitted through the upward data path in the system software to the digital mixing interface at the application layer and provided to upper-layer applications for subsequent operations. Optionally, the subsequent operations may include at least one of the following: saving the obtained mixed audio signal in the local memory 420, uploading it to the network, or transmitting it to a third-party media player for playback; for example, the third-party media player may be a webcast platform, a music player, a video player, or the like.
Optionally, the real-time digital mixing method may include step 509: acquire a video image signal.
In an optional solution, the video image signal is acquired while the audio input device acquires the real-time audio signal; specifically, the video image signal may be acquired through the apparatus's own camera or another device with an image capture function, and consists of consecutive multi-frame image signals. In an optional solution, the video image signal may be obtained from local memory or from the network, and may be a video signal formed by multiple temporally or spatially consecutive pictures or by several non-consecutive pictures.
Optionally, the real-time digital mixing method may include step 510: mix the video image signal with the mixed audio signal to obtain a mixed video signal.
Optionally, the mixed audio signal obtained in step 508 and the video image signal obtained in step 509 may be transmitted to a video processing module for fusion to obtain the mixed video signal; optionally, the video processing module may be the video processing module 670 of the apparatus 600. Optionally, the video processing module 670 may be a software module stored in the memory or implemented by a hardware logic circuit, or a separate chip; for example, the video processing module may be a video codec. In an optional solution, the video processing module may be a software module or hardware circuit in the media processing system 131 of the apparatus 100, or the software module for implementing video mixing stored in the memory 420 in FIG. 4.
Further, after the mixed video signal is obtained, it can be transmitted to a video player for playback, shared in real time with multiple Internet users through the network, or stored in local memory for subsequent playback by the user.
It should be understood that the apparatus embodiments provided in this application are merely schematic; the unit division in FIG. 6 is only a logical functional division, and other division manners are possible in actual implementation. For example, multiple modules may be combined or integrated into another system. The coupling of the modules to one another may be implemented through interfaces, which are typically electrical communication interfaces, though mechanical interfaces or other forms of interface are not excluded. Therefore, modules described as separate components may or may not be physically separate, and may be located in one place or distributed to different positions on the same or different devices.
An embodiment of this application further provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to perform one or more steps of any of the above real-time digital mixing methods. If the component modules of the above apparatus are implemented in the form of software functional units and sold or used as independent products, they may be stored in the computer-readable storage medium.
Based on this understanding, an embodiment of this application further provides a computer program product containing instructions. The part of the technical solutions of this application that contributes in essence, or all or part of the technical solutions, may be embodied in the form of a software product: the computer software product is stored in a storage medium and includes several instructions for causing a computer device, a mobile terminal, or a processor therein to perform all or part of the steps of the methods described in the embodiments of this application. For the kinds of storage medium, refer to the related description of the memory 140. For example, the sound processing module 440 may be implemented in software; in that case, the sound processing module 440 may be an arithmetic unit formed by software running in the application processor 410. In other words, the application processor 410 implements the related method procedures of the embodiments of the present invention by running software instructions.
In this embodiment, the at least one frame of audio data acquired through the media play interface is mixed in real time with the real-time audio signal originating from the audio input device, and the media play interface acquires the audio file frame by frame rather than acquiring the entire file; during mixing, any frame of audio data of another audio file can be switched to at any time through the media play interface, implementing real-time digital mixing. Moreover, the at least one frame of audio data is transmitted through the data path in the system software, so no external environmental noise is introduced. Further, since the mixing is performed in real time and reuses the existing music playback process, calling at least one frame of audio data of the application-layer software through the media play interface is simple to implement and highly flexible.
The above embodiments are merely intended to describe the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents; such modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of this application. For example, for some specific operations in the apparatus embodiments, refer to the foregoing method embodiments.

Claims (22)

  1. An audio signal processing method, characterized in that the method comprises:
     acquiring at least one frame of audio data through a media play interface, wherein the media play interface is an application programming interface;
     transmitting the at least one frame of audio data to a digital mixing module through a data path in system software;
     acquiring a real-time audio signal; and
     mixing the at least one frame of audio data with the real-time audio signal in the digital mixing module to obtain a mixed audio signal.
  2. The method according to claim 1, characterized in that before the transmitting of the at least one frame of audio data to the digital mixing module through the data path in the system software, the method further comprises:
     decoding the at least one frame of audio data from a compressed form into a raw data form.
  3. The method according to claim 1 or 2, characterized in that the system software comprises operating system software.
  4. The method according to any one of claims 1-3, characterized in that the data path comprises at least one of the following: an audio track source node, an audio controller, an audio hardware abstraction layer, or a hardware driver layer.
  5. The method according to any one of claims 1-4, characterized in that playback of the at least one frame of audio data is prohibited during the process of transmitting the at least one frame of audio data to the digital mixing module through the data path in the system software.
  6. The method according to claim 5, characterized in that the prohibiting of playback of the at least one frame of audio data comprises at least one of the following:
     turning off an audio output data stream of the at least one frame of audio data through the audio controller in the data path; or
     controlling, based on the audio hardware abstraction layer in the data path, the hardware driver layer in the data path to disable an audio output device for the at least one frame of audio data.
  7. The method according to any one of claims 1-6, characterized in that before the acquiring of the real-time audio signal, the method further comprises:
     detecting whether the real-time audio signal is being input; and
     reducing a volume of the at least one frame of audio data when input of the real-time audio signal is detected.
  8. The method according to claim 7, characterized in that the reducing of the volume of the at least one frame of audio data comprises at least one of the following:
     reducing the volume of the at least one frame of audio data by controlling the audio controller in the data path; or
     reducing the volume of the at least one frame of audio data by controlling the digital mixing module.
  9. The method according to any one of claims 1-8, characterized in that before the mixing of the at least one frame of audio data with the real-time audio signal in the digital mixing module, the method further comprises:
     performing at least one of the following processes on the real-time audio signal: eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control.
  10. An audio signal processing apparatus, characterized in that the apparatus comprises: a media play interface, a data path located in system software, a real-time signal acquiring module, and a digital mixing module;
     the media play interface is configured to acquire at least one frame of audio data, wherein the media play interface is an application programming interface;
     the data path is configured to transmit the at least one frame of audio data to the digital mixing module;
     the real-time signal acquiring module is configured to acquire a real-time audio signal; and
     the digital mixing module is configured to mix the at least one frame of audio data with the real-time audio signal to obtain a mixed audio signal.
  11. The apparatus according to claim 10, characterized in that it further comprises: a decoding module, configured to decode the at least one frame of audio data from a compressed form into a raw data form before the data path transmits the at least one frame of audio data to the digital mixing module.
  12. The apparatus according to claim 10 or 11, characterized in that the system software comprises operating system software.
  13. The apparatus according to any one of claims 10-12, characterized in that the data path comprises at least one of the following: an audio track source node, an audio controller, an audio hardware abstraction layer, or a hardware driver layer.
  14. The apparatus according to any one of claims 10-13, characterized in that the data path is further configured to: prohibit playback of the at least one frame of audio data during the process of transmitting the at least one frame of audio data to the digital mixing module.
  15. The apparatus according to claim 14, characterized in that the audio controller in the data path is configured to turn off an audio output data stream of the at least one frame of audio data; or
     the audio hardware abstraction layer in the data path is configured to control the hardware driver layer in the data path to disable an audio output device for the at least one frame of audio data.
  16. The apparatus according to any one of claims 10-15, characterized in that it further comprises: an audio detection module, configured to: detect whether the real-time audio signal is being input before the real-time signal acquiring module acquires the real-time audio signal; and, when input of the real-time audio signal is detected, control reduction of a volume of the at least one frame of audio data.
  17. The apparatus according to claim 16, characterized in that the audio detection module is configured to perform at least one of the following:
     sending the control signal to the audio controller in the data path to reduce the volume of the at least one frame of audio data; or
     sending the control signal to the digital mixing module to reduce the volume of the at least one frame of audio data.
  18. The apparatus according to any one of claims 10-17, characterized in that the apparatus further comprises a pre-processing module, configured to: perform at least one of the following processes on the real-time audio signal: eliminating signal aliasing, eliminating jitter, eliminating oversampling, noise suppression, echo cancellation, or gain control.
  19. An audio signal processing apparatus, characterized in that the apparatus comprises a processor;
     the processor is configured to read software instructions in a memory and execute the software instructions to perform the following operations:
     acquiring at least one frame of audio data through a media play interface, wherein the media play interface is an application programming interface;
     transmitting the at least one frame of audio data to a digital mixing module through a data path in system software;
     acquiring a real-time audio signal; and
     mixing the at least one frame of audio data with the real-time audio signal in the digital mixing module to obtain a mixed audio signal.
  20. An audio signal processing apparatus, characterized in that the apparatus comprises: a processor and an audio processor;
     the processor is configured to read software instructions in a memory and execute the software instructions to perform the following operations:
     acquiring at least one frame of audio data through a media play interface, wherein the media play interface is an application programming interface; and
     transmitting the at least one frame of audio data to the audio processor through a data path in system software;
     the audio processor is configured to:
     acquire a real-time audio signal; and
     mix the at least one frame of audio data with the real-time audio signal to obtain a mixed audio signal.
  21. A computer-readable storage medium storing instructions that, when run on a computer or processor, cause the computer or processor to perform the method according to any one of claims 1-9.
  22. A computer program product containing instructions that, when run on a computer or processor, cause the computer or processor to perform the method according to any one of claims 1-9.
PCT/CN2018/105037 2017-09-26 2018-09-11 一种实时数字音频信号混音的方法及装置 WO2019062541A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710877934.7A CN109559763B (zh) 2017-09-26 2017-09-26 一种实时数字音频信号混音的方法及装置
CN201710877934.7 2017-09-26

Publications (1)

Publication Number Publication Date
WO2019062541A1 true WO2019062541A1 (zh) 2019-04-04

Family

ID=65861908

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/105037 WO2019062541A1 (zh) 2017-09-26 2018-09-11 一种实时数字音频信号混音的方法及装置

Country Status (2)

Country Link
CN (2) CN112863474A (zh)
WO (1) WO2019062541A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3902147A1 (en) * 2020-04-26 2021-10-27 Yealink (Xiamen) Network Technology Co., Ltd. Wireless communication device, and method and apparatus for processing voice data

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI703559B (zh) * 2019-07-08 2020-09-01 瑞昱半導體股份有限公司 音效編碼解碼電路及音頻資料的處理方法
CN112533056B (zh) * 2019-09-17 2022-10-28 海信视像科技股份有限公司 一种显示设备及声音再现方法
CN112825245B (zh) * 2019-11-20 2023-04-28 北京声智科技有限公司 实时修音方法、装置及电子设备
CN111142978B (zh) * 2019-12-27 2024-01-30 杭州涂鸦信息技术有限公司 一种基于智能语音设备上打电话的方法及系统
CN111182432B (zh) * 2019-12-30 2022-04-22 重庆电子工程职业学院 多场景自适应智能扩音系统
CN111491176B (zh) * 2020-04-27 2022-10-14 百度在线网络技术(北京)有限公司 一种视频处理方法、装置、设备及存储介质
CN111863011B (zh) * 2020-07-30 2024-03-12 北京达佳互联信息技术有限公司 音频处理方法及电子设备
US11979448B1 (en) * 2020-08-24 2024-05-07 Shared Space Studios Inc. Systems and methods for creating interactive shared playgrounds
CN112423211A (zh) * 2020-10-26 2021-02-26 努比亚技术有限公司 一种多音频传输控制方法、设备及计算机可读存储介质
CN114816312A (zh) * 2021-01-18 2022-07-29 博泰车联网(南京)有限公司 一种音频数据处理方法及装置
CN113096674B (zh) * 2021-03-30 2023-02-17 联想(北京)有限公司 一种音频处理方法、装置及电子设备
CN112951197B (zh) * 2021-04-02 2022-06-24 北京百瑞互联技术有限公司 一种音频混音方法、装置、介质及设备
CN112995541B (zh) * 2021-04-26 2021-08-13 北京易真学思教育科技有限公司 视频回音的消除方法及计算机存储介质
CN115842885A (zh) * 2021-09-18 2023-03-24 北京小米移动软件有限公司 车辆通话方法、装置、电子设备及存储介质
CN114579077B (zh) * 2022-02-23 2024-04-16 青岛海信宽带多媒体技术有限公司 一种音量控制设备
CN115529379B (zh) * 2022-03-22 2023-06-20 荣耀终端有限公司 防止蓝牙音频Track音轨抖动的方法、电子设备及存储介质
CN117135532B (zh) * 2023-04-28 2024-06-14 荣耀终端有限公司 音频数据处理方法、设备和存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103268763A (zh) * 2013-06-05 2013-08-28 广州市花都区中山大学国光电子与通信研究院 一种基于同步音频提取和实时传输的无线影音系统
CN103310776A (zh) * 2013-05-29 2013-09-18 亿览在线网络技术(北京)有限公司 一种实时混音的方法和装置
US20140369528A1 (en) * 2012-01-11 2014-12-18 Google Inc. Mixing decision controlling decode decision
CN106558314A (zh) * 2015-09-29 2017-04-05 广州酷狗计算机科技有限公司 一种混音处理方法和装置及设备
CN106782576A (zh) * 2017-02-15 2017-05-31 合网络技术(北京)有限公司 音频混音方法及装置
CN107040496A (zh) * 2016-02-03 2017-08-11 中兴通讯股份有限公司 一种音频数据处理方法和装置

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9351069B1 (en) * 2012-06-27 2016-05-24 Google Inc. Methods and apparatuses for audio mixing
CN103983272B (zh) * 2014-05-05 2017-01-11 惠州华阳通用电子有限公司 基于Android平台的车机适配导航软件的方法
CN105429984B (zh) * 2015-11-27 2019-03-15 刘军 媒体播放方法、设备及音乐教学系统
CN106131472A (zh) * 2016-07-26 2016-11-16 维沃移动通信有限公司 一种录像方法及移动终端

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140369528A1 (en) * 2012-01-11 2014-12-18 Google Inc. Mixing decision controlling decode decision
CN103310776A (zh) * 2013-05-29 2013-09-18 亿览在线网络技术(北京)有限公司 一种实时混音的方法和装置
CN103268763A (zh) * 2013-06-05 2013-08-28 广州市花都区中山大学国光电子与通信研究院 一种基于同步音频提取和实时传输的无线影音系统
CN106558314A (zh) * 2015-09-29 2017-04-05 广州酷狗计算机科技有限公司 一种混音处理方法和装置及设备
CN107040496A (zh) * 2016-02-03 2017-08-11 中兴通讯股份有限公司 一种音频数据处理方法和装置
CN106782576A (zh) * 2017-02-15 2017-05-31 合网络技术(北京)有限公司 音频混音方法及装置

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3902147A1 (en) * 2020-04-26 2021-10-27 Yealink (Xiamen) Network Technology Co., Ltd. Wireless communication device, and method and apparatus for processing voice data
US11545161B2 (en) 2020-04-26 2023-01-03 Yealink (Xiamen) Network Technology Co., Ltd. Wireless communication device, and method and apparatus for processing voice data

Also Published As

Publication number Publication date
CN109559763B (zh) 2021-01-15
CN112863474A (zh) 2021-05-28
CN109559763A (zh) 2019-04-02

Similar Documents

Publication Publication Date Title
WO2019062541A1 (zh) 一种实时数字音频信号混音的方法及装置
CN105872253B (zh) 一种直播声音处理方法及移动终端
US11251763B2 (en) Audio signal adjustment method, storage medium, and terminal
WO2016177296A1 (zh) 一种生成视频的方法和装置
CN106531177B (zh) 一种音频处理的方法、移动终端以及系统
US10834503B2 (en) Recording method, recording play method, apparatuses, and terminals
US20140105411A1 (en) Methods and systems for karaoke on a mobile device
US20070087686A1 (en) Audio playback device and method of its operation
WO2019033986A1 (zh) 声音播放器件的检测方法、装置、存储介质及终端
WO2019033987A1 (zh) 提示方法、装置、存储介质及终端
JP6717940B2 (ja) オーディオファイルの再録音方法、装置及び記憶媒体
US20140241702A1 (en) Dynamic audio perspective change during video playback
WO2017215507A1 (zh) 一种音效处理方法及移动终端
US9230529B2 (en) Music reproducing apparatus
WO2022267468A1 (zh) 一种声音处理方法及其装置
CN106303841B (zh) 一种音频播放方式的切换方法及移动终端
CN116795753A (zh) 音频数据的传输处理的方法及电子设备
WO2022062979A1 (zh) 音频处理方法、计算机可读存储介质、及电子设备
US20140223500A1 (en) Method and system for transmitting wirelessly video in portable terminal
WO2017185602A1 (zh) 一种耳机模式切换方法和电子设备
WO2015117550A1 (en) Method and apparatus for acquiring reverberated wet sound
US20100174825A1 (en) Internet radio systems and methods thereof
CA2785958C (en) Mobile communication device with receiver speaker
CN106293607B (zh) 自动切换音频输出模式的方法及系统
CN114786116A (zh) 会议一体机的声音处理方法、会议一体机以及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18861695

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18861695

Country of ref document: EP

Kind code of ref document: A1