WO2022247455A1 - Audio splitting method and electronic device - Google Patents

Audio splitting method and electronic device

Info

Publication number
WO2022247455A1
WO2022247455A1 PCT/CN2022/084069 CN2022084069W WO2022247455A1 WO 2022247455 A1 WO2022247455 A1 WO 2022247455A1 CN 2022084069 W CN2022084069 W CN 2022084069W WO 2022247455 A1 WO2022247455 A1 WO 2022247455A1
Authority
WO
WIPO (PCT)
Prior art keywords
application
audio
applications
electronic device
user
Prior art date
Application number
PCT/CN2022/084069
Other languages
English (en)
French (fr)
Inventor
宋孟
李洪江
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022247455A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/165 Management of the audio stream, e.g. setting of volume, audio stream path
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/162 Interface to dedicated audio devices, e.g. audio drivers, interface to CODECs

Definitions

  • the present application relates to the field of electronic technology, and in particular to an audio distribution method and electronic equipment.
  • a PC has different audio endpoint devices such as earphones, microphones, and speakers, which can be called "audio devices" or "endpoints" of the PC.
  • the audio device may not be limited to a built-in device or an external device of the PC, for example, an earphone is connected to the PC through an earphone jack, the earphone may be used as an audio playback device of the PC, and the speaker may be a built-in device of the PC.
  • the user may connect an earphone through an earphone port of the PC, and use the earphone to simultaneously answer the conference call and listen to the audio corresponding to the video.
  • the user may expect to use a certain translation software to translate the subtitles of the currently playing video.
  • the video segment being played has English subtitles.
  • the acquired video can be translated in real time.
  • the English audio is translated, and the translated Chinese subtitles are displayed in the video playback window for users to view.
  • the translation software acquires mixed audio including the audio of the conference call and the audio of the current video clip, which affects the accuracy of translation. If the user uses more playback applications at the same time, the translation software cannot obtain the audio of the current video from the mixed audio of more applications, which affects the translation process and translation accuracy, and reduces the user experience.
  • similarly, when the user performs operations such as voice-to-text conversion on the audio of the conference call through applications such as Smart Voice, the mixed audio affects the accuracy of the resulting text.
  • the present application provides an audio distribution method and electronic equipment.
  • the method provides users with the ability to distribute audio services, without requiring users to switch audio drivers and audio devices, and realizes the rapid and accurate completion of audio data distribution. Furthermore, this process does not require additional virtual audio drivers and virtual audio devices, and does not affect the use of special sound effects, thereby improving user experience.
  • a method for splitting audio, characterized in that it is applied to an electronic device including one or more audio devices, and the method includes: receiving a first operation of a user and, in response to the first operation, playing, by the electronic device, audio corresponding to M applications through a first audio device, wherein the first audio device is any one of the one or more audio devices, each of the M applications is an application capable of outputting corresponding audio, and M is an integer greater than or equal to 2; receiving a second operation of the user, wherein the second operation is used to request the audio data of a first application, and the first application is any one of the M applications; in response to the second operation, determining the playback resource of the first application from the playback resources associated with the M applications; and obtaining the audio data of the first application from the playback resource of the first application.
  • M applications in this embodiment of the present application may be playback applications.
  • the playback applications may include any one or more of various applications.
  • the electronic device may include different audio devices (endpoints) such as a microphone, an earphone, and a speaker.
  • the user can set the PC to play audio from different applications through any one or more audio devices such as earphones and speakers.
  • This application is mainly aimed at the scenario where the PC plays audio of different applications through any audio device such as earphones and speakers.
  • the audio splitting process provided by this application can be based on the software structure of the system, and an additional audio service platform needs to be built.
  • the audio service platform can be realized through an audio service-related software development kit.
  • the audio service-related software development kit can provide audio service modules and audio processing modules.
  • the audio service module can be implemented by the user pre-installing an audio service program provided by the manufacturer (such as the Huawei audio service provided by Huawei), and the audio processing module can be implemented by the user pre-installing the audio software development kit provided by the manufacturer (such as the Huawei audio SDK provided by Huawei).
  • the user can use the voice-to-text function of the smart voice application to convert the audio of the current conference call into text and record it; the audio processing module of the electronic device can then return the obtained audio data of the conference call application to the smart voice application, so that the smart voice application can convert the current audio data into text.
  • the audio processing module of the electronic device can return the acquired audio data of the video application to the translation software, so that the translation software can convert the current English audio into Chinese subtitles and display them on the screen in real time.
  • the user may use a multi-screen interactive application to switch the video of the PC to other electronic devices such as a tablet or a smart screen for display; the audio processing module of the electronic device may then return the acquired audio data, image data, and the like of the video application to the multi-screen interactive application, so that the multi-screen interactive application can switch the audio and images corresponding to the video of the PC to other electronic devices such as a tablet or a smart screen. Further examples are not given here.
  • the audio service module obtains the process information of each application and sends the process information to the APO module, so that the APO module can accurately match the corresponding SFX according to the process information of each application, that is, specify a unique SFX for each application; the user can then accurately obtain the audio data corresponding to each of the multiple applications.
  • This process does not require users to switch audio drivers and audio devices through tedious operations, which simplifies the operation process.
  • the method can split out the audio data of several applications at once from the audio data of multiple applications.
  • when the audio data of each application needs to be obtained at the same time, it can be provided to the user.
  • this process does not require additional virtual audio drivers and virtual audio devices, and does not affect the use of special sound effects, thereby improving user experience.
  • the first application may be multiple applications, that is, the audio data of the conference call and the audio data of the video application are acquired simultaneously.
  • when the audio processing module receives a request message from the smart voice application and a request message from the translation software, the audio processing module can determine the playback resource SFX3 according to the pid of the conference call and obtain the audio data of SFX3, determine the playback resource SFX2 according to the pid of the video application and obtain the audio data of SFX2, and execute the corresponding operations.
  • audio data of multiple different applications can be obtained simultaneously, and audio data can be split.
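
As a rough illustration of serving two requesters at once, the sketch below models each playback resource as a plain dictionary entry keyed by pid. The function name `handle_requests`, the requester names, and the frame strings are hypothetical and chosen only for this example; the patent text does not define such an interface.

```python
# pid -> per-application playback resource (SFX). The frame values stand in
# for audio data each application has written; all data here is illustrative.
playback_resources = {
    1: {"sfx": "SFX1", "frames": ["music-0", "music-1"]},
    2: {"sfx": "SFX2", "frames": ["video-0", "video-1"]},
    3: {"sfx": "SFX3", "frames": ["call-0", "call-1"]},
}


def handle_requests(requests, resources):
    """Give each requester the frames of the application it asked for.

    `requests` maps a requester name to the pid whose audio it wants.
    """
    results = {}
    for requester, pid in requests.items():
        resource = resources.get(pid)
        if resource is None:
            raise KeyError(f"no playback resource for pid {pid}")
        # Copy so that a requester cannot mutate the live buffer.
        results[requester] = list(resource["frames"])
    return results


# Smart voice asks for the conference call (pid 3); the translation
# software asks for the video application (pid 2).
split = handle_requests(
    {"smart_voice": 3, "translation_software": 2},
    playback_resources,
)
```

Because each requester reads from a different per-application resource, neither result contains frames from the music application or from the other requester's target.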
  • the method further includes: determining a mapping relationship between the M process identifiers corresponding to the M applications and the playback resources corresponding to the M applications, wherein each of the M applications corresponds to one process identifier, and each of the M applications corresponds to one playback resource.
  • the electronic device can determine the pid of each currently running application through WASAPI.
  • the music application corresponds to pid 1
  • the video application corresponds to pid 2
  • the conference call application corresponds to pid 3.
  • the APO module creates a playback resource SFX for each application being played, and establishes a one-to-one correspondence between the pid of each application and its playback resource SFX.
  • the APO module may send information about the one-to-one correspondence between the pid and the SFX to the audio processing module (such as the Huawei audio SDK).
  • the APO module can periodically send information about the one-to-one correspondence between pid and SFX to the audio processing module (such as the Huawei audio SDK), or the APO module can send this information after receiving a request from the audio processing module; this application does not limit the timing and method by which the APO module sends the information about the one-to-one correspondence between pid and SFX to the audio processing module.
  • determining the playback resource of the first application from the playback resources corresponding to the M applications includes: determining the process identifier of the first application; and, according to the process identifier of the first application and the mapping relationship, determining the playback resource of the first application from the M playback resources.
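
The pid-to-SFX mapping described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the class names `PlaybackResource` and `ApoModule` and their methods are hypothetical and are not a real Windows APO interface; only the pid values (1, 2, 3) and the SFX names follow the examples in the text.

```python
from dataclasses import dataclass, field


@dataclass
class PlaybackResource:
    """Hypothetical stand-in for one per-application SFX playback resource."""
    name: str
    frames: list = field(default_factory=list)  # audio written by the application


class ApoModule:
    """Illustrative model of the module that owns the pid -> SFX mapping."""

    def __init__(self):
        self._by_pid = {}

    def register(self, pid, name):
        # One playback resource per process identifier (one-to-one).
        if pid in self._by_pid:
            raise ValueError(f"pid {pid} already has a playback resource")
        resource = PlaybackResource(name)
        self._by_pid[pid] = resource
        return resource

    def mapping(self):
        # Snapshot of pid -> SFX name, as could be sent to the audio
        # processing module either periodically or on request.
        return {pid: res.name for pid, res in self._by_pid.items()}

    def resource_for(self, pid):
        # Look up the playback resource of one application by its pid.
        return self._by_pid[pid]


apo = ApoModule()
apo.register(1, "SFX1")  # music application
apo.register(2, "SFX2")  # video application
apo.register(3, "SFX3")  # conference call application

# The conference call writes audio only into its own resource.
apo.resource_for(3).frames.append(b"conference-frame")
```

Looking up pid 3 returns only the conference call's resource, so the audio of the other two applications never appears in the data that is read back.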
  • the method further includes: the electronic device runs the M applications and allocates M playback resources to the M applications; and acquires the M process identifiers corresponding to the M applications.
  • the method further includes: receiving a third operation of the user, the third operation being used to request interruption of playing the audio of the first application; and, in response to the third operation, the playback resource of the first application suspends receiving the audio data of the first application.
  • the method further includes: receiving a fourth operation of the user, the fourth operation being used to close the first application; and, in response to the fourth operation, the playback resource of the first application suspends receiving the audio data of the first application, and the playback resource of the first application is released.
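
The difference between the third operation (interrupt playback) and the fourth operation (close the application) can be sketched as a small lifecycle model. The class and method names below are hypothetical, and dropping frames that arrive while suspended is an assumption of this sketch rather than behavior stated in the text.

```python
class PlaybackResource:
    """Hypothetical per-application playback resource (an SFX in the text)."""

    def __init__(self, name):
        self.name = name
        self.receiving = True
        self.frames = []

    def write(self, frame):
        # Assumption: frames arriving while suspended are simply dropped.
        if self.receiving:
            self.frames.append(frame)
            return True
        return False


class ResourceTable:
    """Illustrative pid -> playback resource table."""

    def __init__(self):
        self._by_pid = {}

    def allocate(self, pid, name):
        self._by_pid[pid] = PlaybackResource(name)
        return self._by_pid[pid]

    def pause(self, pid):
        # Third operation: stop receiving, but keep the resource allocated.
        self._by_pid[pid].receiving = False

    def close(self, pid):
        # Fourth operation: stop receiving, then release the resource.
        resource = self._by_pid.pop(pid)
        resource.receiving = False
        return resource

    def __contains__(self, pid):
        return pid in self._by_pid


table = ResourceTable()
sfx3 = table.allocate(3, "SFX3")
sfx3.write("call-0")

table.pause(3)
accepted_while_paused = sfx3.write("call-1")
still_allocated_after_pause = 3 in table

table.close(3)
```

After `pause` the resource is still in the table but accepts no audio; after `close` it is removed from the table altogether.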
  • the M applications include any application among music applications, video applications, game applications, and meeting applications.
  • the audio service module obtains the process information of each application and sends the process information to the APO module, so that the APO module can accurately match the corresponding SFX according to the process information of each application, that is, specify a unique SFX for each application; users can then accurately obtain the audio data corresponding to each of the multiple applications.
  • This process does not require users to switch audio drivers and audio devices through tedious operations, which simplifies the operation process.
  • the method can split out the audio data of several applications at once from the audio data of multiple applications.
  • when the audio data of each application needs to be obtained at the same time, it can be provided to the user.
  • this process does not require additional virtual audio drivers and virtual audio devices, and does not affect the use of special sound effects, thereby improving user experience.
  • an electronic device, characterized in that it includes: one or more audio devices; one or more processors; one or more memories; and a module on which multiple application programs are installed; the memory stores one or more programs, and when the one or more programs are executed by the processor, the electronic device performs the following steps: receiving a first operation of a user and, in response to the first operation, playing audio corresponding to M applications through a first audio device, wherein the first audio device is any one of the one or more audio devices, each of the M applications is an application capable of outputting corresponding audio, and M is an integer greater than or equal to 2; receiving a second operation of the user, wherein the second operation is used to request the audio data of a first application, and the first application is any one of the M applications; in response to the second operation, determining the playback resource of the first application from the playback resources associated with the M applications; and acquiring the audio data of the first application from the playback resource of the first application.
  • when the one or more programs are executed by the processor, the electronic device is made to perform the following steps: determining the mapping relationship between the M process identifiers corresponding to the M applications and the playback resources corresponding to the M applications, wherein each of the M applications corresponds to one process identifier, and each of the M applications corresponds to one playback resource.
  • when the one or more programs are executed by the processor, the electronic device is made to perform the following steps: determining the process identifier of the first application; and, according to the process identifier of the first application and the mapping relationship, determining the playback resource of the first application from the M playback resources.
  • when the one or more programs are executed by the processor, the electronic device is made to perform the following steps: running the M applications and allocating M playback resources to the M applications; and acquiring the M process identifiers corresponding to the M applications.
  • when the one or more programs are executed by the processor, the electronic device is made to perform the following steps: receiving a third operation of the user, the third operation being used to request interruption of playing the audio of the first application; and, in response to the third operation, the playback resource of the first application suspends receiving the audio data of the first application.
  • when the one or more programs are executed by the processor, the electronic device is made to perform the following steps: receiving a fourth operation of the user, the fourth operation being used to close the first application; and, in response to the fourth operation, the playback resource of the first application suspends receiving the audio data of the first application, and the playback resource of the first application is released.
  • the M applications include any application among music applications, video applications, game applications, and conference applications.
  • an electronic device including: one or more processors; one or more memories; and a module on which multiple application programs are installed; the memory stores one or more programs, and when the one or more programs are executed by the processor, the electronic device is made to perform the steps performed in any possible implementation of any of the above aspects.
  • a graphical user interface on an electronic device, where the electronic device has a display screen, a memory, and one or more processors, the one or more processors are used to execute one or more computer programs stored in the memory, and the graphical user interface includes the graphical user interface displayed when the electronic device performs any possible audio splitting method in any of the above aspects.
  • an apparatus which is included in an electronic device, and has a function of realizing the behavior of the electronic device in the above first aspect and possible implementation manners of the above first aspect.
  • This function may be implemented by hardware, or may be implemented by executing corresponding software on the hardware.
  • Hardware or software includes one or more modules or units corresponding to the functions described above. For example, a display module or unit, a detection module or unit, a processing module or unit, etc.
  • a computer storage medium including computer instructions, and when the computer instructions are run on the electronic device, the electronic device is made to perform any possible audio splitting method in any one of the above aspects.
  • a computer program product is provided; when the computer program product runs on an electronic device, the electronic device is made to perform any possible audio splitting method in any one of the above aspects.
  • FIG. 1 is a schematic diagram of a scenario where a user uses multiple applications on a PC.
  • FIG. 2 is a schematic diagram of an example of audio splitting process.
  • FIG. 3 is a schematic structural diagram of an example of an electronic device provided by an embodiment of the present application.
  • FIG. 4 is a block diagram of a software structure of an electronic device provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an example of an audio splitting process provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an audio stream processing procedure provided by an embodiment of the present application.
  • FIG. 7 is a sequence diagram of an example of audio stream processing provided by the embodiment of the present application.
  • FIG. 1 is a schematic diagram of a scenario where a user uses multiple applications on a PC. As shown in (a) of FIG. 1, the user uses the PC 100 to play a video segment, which is played in window 10 on the display screen of the PC 100; a conference call window 20 is also displayed on the screen. It should be understood that the embodiments of the present application do not limit the application that plays the video or the application used for the conference call.
  • PC 100 may include various audio endpoints such as microphones, headphones, and speakers. Specifically, for the audio output by the PC 100, the user can set the PC 100 to play audio from different applications through any one or more audio devices such as earphones and speakers.
  • the user can set the PC 100 to play the audio received during the conference call through the speaker 110, and simultaneously play the audio corresponding to the video through the speaker 110.
  • the PC 100 can be set to play the audio received during the conference call through the external earphone 120, and at the same time play the audio corresponding to the video through the speaker 110.
  • the PC 100 can be set to play the audio corresponding to the video through the speaker 110, and play the audio received during the conference call through the external earphone 120.
  • the earphone 120 can also receive the voice of the current user and input it to the PC 100 through the microphone.
  • the name of the microphone can be recorded as "Synaptics HD BXX".
  • the embodiment of the present application focuses on the process in which the earphone 120, as an audio playback device, outputs the audio of one or more playback applications of the PC 100; the process in which the earphone 120, as an audio receiving device, receives the voice of the current user will not be described in detail.
  • the user can also run a certain translation software to obtain, in real time, the English audio of the video currently playing in window 10 and translate it.
  • this process relies on Microsoft's Windows Audio Session API (WASAPI).
  • the translation software obtains the mixed audio of the conference call and video, which cannot achieve accurate translation.
  • if the user wants to convert the audio of the current conference call into text, the voice-to-text function of Smart Voice can be used.
  • however, the mixed audio lowers the accuracy of the conversion, which affects the efficiency of audio-to-text conversion and degrades the user experience.
  • when the user expects to switch the video on the PC 100 to another electronic device such as a tablet or a smart screen, a multi-screen application can be used to switch the playback picture in video window 10 to that device. However, this process cannot separate the audio corresponding to the video from the mixed audio; that is, it is impossible to switch the playback picture in window 10 to the other device and, at the same time, switch the corresponding audio of the video to that device, which degrades the user's multi-screen experience.
  • the same endpoint collects mixed audio from multiple applications, which cannot realize functions such as real-time translation and multi-screen switching, which reduces the user experience.
  • FIG. 2 is a schematic diagram of an example of audio splitting process.
  • the method mainly uses the virtual sound card technology to extract the audio corresponding to a certain application from the mixed audio of multiple applications.
  • before using the virtual sound card technology, the user first needs to install the virtual audio driver on the PC; after installation, the virtual sound card (voice meeter) is automatically generated on the PC.
  • the audio driver is used to simultaneously play the audio of the conference call, the audio of the video and the audio of the music application through the same audio device (such as the external earphone 120).
  • the user can manually set the audio of the conference call to be played through the virtual sound card.
  • the user can set the audio of the conference call to be mounted under the virtual audio driver through the relevant options in the "Application Volume and Device Preferences" menu, that is, to be played through the virtual sound card.
  • the process may include the steps shown in Figure 2:
  • Step 1: for playback applications such as the teleconferencing, video, and music applications, a corresponding process is specified for each application through the API, that is, a process identifier (pid) is assigned to each application.
  • the "playback applications" in this embodiment of the application may include applications that have corresponding audio output during operation, such as music applications, video applications, teleconferencing applications, and game applications; the embodiments of this application do not list them one by one.
  • Step 2: the user activates the voice-to-text function of Smart Voice, which triggers the Smart Voice application to request the pid of each running playback application through the API, and the user manually sets the audio of the conference call to be played through the virtual sound card.
  • the user can change the output of the teleconferencing application to an audio device associated with the virtual sound card in the "Application Volume and Device Preferences" menu, according to the method shown in (b) of FIG. 1; details are not repeated here.
  • Step 3: according to the user's settings, the API determines the pid corresponding to the conference call, mounts the process of the conference call on the virtual audio driver, and calls the virtual sound card to play the audio of the conference call.
  • Step 4: the process of the music application and the process of the video application are mounted on the default audio driver of the system through the API, and the default audio device of the system is called to play the audio of the music application and the audio of the video application.
  • Step 5: the smart voice application requests the audio data corresponding to the conference call on the virtual audio driver.
  • Step 6: the audio data corresponding to the conference call is returned to the smart voice application, and the smart voice application performs voice-to-text conversion based on the acquired audio data.
  • step 2 in the above process can be implemented through WASAPI, and step 3 can be implemented through other interfaces.
  • the embodiments of the present application do not limit the API type or other details of the implementation process.
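
The routing in Steps 3 and 4 of this virtual sound card approach can be sketched as follows. This is a toy model only: the driver names are plain strings, the function `route` is hypothetical, and no real audio API is called.

```python
DEFAULT_DRIVER = "system default audio driver"
VIRTUAL_DRIVER = "virtual audio driver"


def route(apps, user_choice):
    """Mount the user-selected application on the virtual driver and
    every other application on the system default driver."""
    drivers = {DEFAULT_DRIVER: [], VIRTUAL_DRIVER: []}
    for name, pid in apps.items():
        target = VIRTUAL_DRIVER if name == user_choice else DEFAULT_DRIVER
        drivers[target].append(pid)
    return drivers


# pid values follow the example used elsewhere in the text.
apps = {"conference call": 3, "video": 2, "music": 1}
drivers = route(apps, user_choice="conference call")
```

Only the application the user picked by hand ends up alone on the virtual driver; everything else shares the default driver.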
  • the audio of the conference call can be separated from the mixed audio of the conference call, video application and music application running on the current PC.
  • however, this process requires the user to manually set the conference call to be played through the virtual sound card before the multi-screen interactive application or smart voice application can obtain the audio data corresponding to the virtual audio driver, which is very cumbersome, and it requires an additional virtual audio driver node.
  • the audio data is mixed data after reaching the driver layer (such as the system default audio driver or virtual audio driver).
  • this technology can only obtain the audio data of a single application.
  • for example, if the user only sets the audio of the conference call to be played through the virtual sound card, only the audio data of the conference call can be obtained.
  • the audio data splitting cannot be completed.
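
A toy example may help show why data that is already mixed at the driver layer cannot simply be split again. The `mix` function below is an assumed, simplified mixer (sample-wise sum with 16-bit clipping), not the actual Windows audio engine, and the sample values are made up.

```python
def mix(streams):
    """Mix equally long sample sequences by summing each position and
    clipping the result to the signed 16-bit range."""
    return [max(-32768, min(32767, sum(samples))) for samples in zip(*streams)]


# Case 1: a conference call and a video playing together.
call = [1000, -2000, 300]
video = [500, 500, -100]
mixed_a = mix([call, video])

# Case 2: two different source signals that sum to the same samples.
other_call = [1500, -1500, 200]
other_video = [0, 0, 0]
mixed_b = mix([other_call, other_video])
```

Both cases yield the same mixed samples, so the mixed stream alone does not determine what the conference call contributed. Reading each application's audio before the mixing stage avoids this ambiguity.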
  • the embodiment of the present application provides an audio distribution method, which can realize the distribution of audio data of multiple applications.
  • the technical solution in the embodiment of the application will be described below with reference to the drawings in the embodiment of the application.
  • the terms "first" and "second" are used for descriptive purposes only, and cannot be understood as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of these features.
  • the audio splitting method provided in the embodiments of the present application can be applied to electronic devices such as mobile phones, tablet computers, wearable devices, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks, and personal digital assistants (PDA); the embodiments of the present application do not impose any restrictions on the specific type of electronic device.
  • FIG. 3 is a schematic structural diagram of an electronic device 100 provided in an embodiment of the present application.
  • the electronic device 100 may correspond to the PC in FIG. 1, and the electronic device 100 includes a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, a button 190, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, etc.
  • the sensor module 180 may include a touch sensor 180K, a fingerprint sensor 180H and the like.
  • the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the electronic device 100.
  • the electronic device 100 may include more or fewer components than shown in the figure, or combine certain components, or separate certain components, or arrange different components.
  • the illustrated components can be realized in hardware, software or a combination of software and hardware.
  • the processor 110 may include one or more processing units, for example: the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices, or may be integrated in one or more processors.
  • the processor 110 may be the nerve center and command center of the electronic device.
  • The processor 110 may generate an operation control signal according to an instruction, and thereby control instruction fetching and instruction execution.
  • The processor 110 can be used to control the audio module 170 to collect audio data corresponding to multiple applications. When the electronic device 100 displays the video window on other electronic devices such as tablets and smart screens, the processor 110 may control the audio data corresponding to the video being played in the video window to be output to those other electronic devices.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 is a cache memory.
  • The memory may hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs to use the instructions or data again, it can fetch them directly from this memory. This avoids repeated access, reduces the waiting time of the processor 110, and thus improves the efficiency of the system.
  • processor 110 may include one or more interfaces.
  • The interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the USB interface 130 is an interface conforming to the USB standard specification, specifically, it can be a Mini USB interface, a Micro USB interface, a USB Type C interface, and the like.
  • the USB interface 130 can be used to connect a charger to charge the electronic device 100 , and can also be used to transmit data between the electronic device 100 and peripheral devices.
  • the USB interface 130 can also be used to connect an earphone, and play audio through the earphone. This interface can also be used to connect other electronic devices, such as AR devices.
  • the interface connection relationship between the modules shown in the embodiment of the present application is only a schematic illustration, and does not constitute a structural limitation of the electronic device 100 .
  • the electronic device 100 may also adopt different interface connection manners in the foregoing embodiments, or a combination of multiple interface connection manners.
  • the charging management module 140 is configured to receive a charging input from a charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module 140 can receive charging input from the wired charger through the USB interface 130 .
  • The charging management module 140 may receive a wireless charging input through a wireless charging coil of the electronic device 100. While the charging management module 140 is charging the battery 142, it can also supply power to the electronic device 100 through the power management module 141.
  • the power management module 141 is used for connecting the battery 142 , the charging management module 140 and the processor 110 .
  • the power management module 141 receives the input from the battery 142 and/or the charging management module 140 to provide power for the processor 110 , the internal memory 121 , the external memory, the display screen 194 , the camera 193 , and the wireless communication module 160 .
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
  • the power management module 141 may also be disposed in the processor 110 .
  • the power management module 141 and the charging management module 140 may also be set in the same device.
  • the wireless communication function of the electronic device 100 can be realized by the antenna 1 , the antenna 2 , the mobile communication module 150 , the wireless communication module 160 , a modem processor, a baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in electronic device 100 may be used to cover single or multiple communication frequency bands. Different antennas can also be multiplexed to improve the utilization of the antennas.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G applied on the electronic device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA) and the like.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signals modulated by the modem processor, and convert them into electromagnetic waves through the antenna 1 for radiation.
  • at least part of the functional modules of the mobile communication module 150 may be set in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be set in the same device.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator sends the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low-frequency baseband signal is passed to the application processor after being processed by the baseband processor.
  • the application processor outputs sound signals through audio equipment (not limited to speaker 170A, receiver 170B, etc.), or displays images or videos through display screen 194 .
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent from the processor 110, and be set in the same device as the mobile communication module 150 or other functional modules.
  • The wireless communication module 160 can provide wireless communication solutions including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and the like.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110 , frequency-modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
  • the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
  • The wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, etc.
  • The GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a Beidou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
  • the electronic device and other electronic devices can implement information transmission based on wireless communication technology by means of their respective antennas and mobile communication modules.
  • the PC can send the corresponding screen display data and audio data in the video playback window to other electronic devices such as tablets and mobile phones, and then play the video on other electronic devices such as tablets and mobile phones, which will not be repeated here.
  • the electronic device 100 realizes the display function through the GPU, the display screen 194 , and the application processor.
  • the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
  • the display screen 194 is used to display images, videos and the like.
  • the display screen 194 includes a display panel.
  • The display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, quantum dot light emitting diodes (QLED), etc.
  • the electronic device 100 may include 1 or N display screens 194 , where N is a positive integer greater than 1.
  • the GPU may be used to render the application interface, and correspondingly, the display screen 194 may be used to display the application interface rendered by the GPU.
  • the GPU of the PC 100 can render the interface according to the image data corresponding to the video application, and display the picture of the video playback window 10 on the display screen of the PC 100, which will not be repeated here.
  • the electronic device 100 can realize the shooting function through the ISP, the camera 193 , the video codec, the GPU, the display screen 194 and the application processor.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, so as to expand the storage capacity of the electronic device 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. Such as saving music, video and other files in the external memory card.
  • the internal memory 121 may be used to store computer-executable program codes including instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 .
  • the internal memory 121 may include an area for storing programs and an area for storing data.
  • the stored program area can store an operating system, at least one application program required by a function (such as a sound playing function, an image playing function, etc.) and the like.
  • the storage data area can store data created during the use of the electronic device 100 (such as audio data, phonebook, etc.) and the like.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
  • the electronic device 100 can implement audio functions through the audio module 170 , the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor. Such as music playback, recording, etc.
  • the audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signal.
  • the audio module 170 may also be used to encode and decode audio signals.
  • the audio module 170 may be set in the processor 110 , or some functional modules of the audio module 170 may be set in the processor 110 .
  • The speaker 170A, also referred to as a "horn", is used to convert audio electrical signals into sound signals.
  • The user can listen to music or to a hands-free call through the speaker 170A of the electronic device 100.
  • The receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
  • The receiver 170B can be placed close to the human ear to listen to the voice.
  • The microphone 170C, also called a "mic", is used to convert sound signals into electrical signals. When making a phone call or sending a voice message, the user can speak with the mouth close to the microphone 170C to input the sound signal to the microphone 170C.
  • the electronic device 100 may be provided with at least one microphone 170C. In some other embodiments, the electronic device 100 may be provided with two microphones 170C, which may also implement a noise reduction function in addition to collecting sound signals. In some other embodiments, the electronic device 100 can also be provided with three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and realize directional recording functions, etc.
  • the earphone interface 170D is used for connecting wired earphones.
  • the earphone interface 170D can be a USB interface 130, or a 3.5mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
  • any audio device in the audio module 170 can simultaneously play audio data from multiple applications.
  • the user can set the audio of the video clip and the audio of the conference call to be played simultaneously through the earphone.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access to application locks, take pictures with fingerprints, answer incoming calls with fingerprints, and the like.
  • The touch sensor 180K is also known as a "touch panel".
  • The touch sensor 180K can be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 together form a touchscreen.
  • the touch sensor 180K is used to detect a touch operation acting on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to the touch operation can be provided through the display screen 194 .
  • the touch sensor 180K may also be disposed on the surface of the electronic device 100 , which is different from the position of the display screen 194 .
  • the keys 190 include a power key, a volume key and the like.
  • the key 190 may be a mechanical key. It can also be a touch button.
  • the electronic device 100 can receive key input and generate key signal input related to user settings and function control of the electronic device 100 .
  • the indicator 192 can be an indicator light, and can be used to indicate charging status, power change, and can also be used to indicate messages, missed calls, notifications, and the like.
  • The operating system of the electronic device 100 may include, but is not limited to, operating systems such as Harmony, which is not limited in this embodiment of the present application.
  • The following takes a PC as an example to illustrate the software structure of the electronic device 100.
  • FIG. 4 is a block diagram of a software structure of an electronic device 100 provided by an embodiment of the present application.
  • The system is generally divided into kernel mode and user mode.
  • The kernel mode and the user mode can run at different privilege levels of the central processing unit (CPU). For example, the kernel mode can run at level 0 of the CPU, and the user mode can run at level 3 of the CPU.
  • Each layer of the system consists of several components. As a whole, the operation of the system depends on upper-layer components calling lower-layer components.
  • Each layer of components has a fixed interface for the upper layer to call. If the upper layer wants to perform an operation that requires different permissions, it needs to make a request to the lower layer.
  • the application layer can include music applications, video applications, teleconferencing applications, game applications, etc., as well as multi-screen interactive applications such as translation and speech-to-text applications, and smart voice applications.
  • the embodiment does not limit this.
  • the application layer may call an application programming interface (application programming interface, API) in its corresponding subsystem (such as Win32, POSIX, OS/2, etc.).
  • the application layer can call the system service function through the API, and then call the corresponding service, while the application programs are kept relatively isolated, and the communication between them needs to be completed through the system.
  • the operating system can provide some basic inter-process communication to support the interoperability between applications in the application layer.
  • the subsystem can convert API functions into Native APIs to achieve compatibility with applications.
  • the function call in the Native API is converted into a system service function call and enters the kernel mode, and is further passed down to realize the corresponding function.
  • The kernel mode of the system implements the basic mechanisms of the operating system. All core code runs in kernel mode, where it is protected from malicious attacks.
  • Applications running in user mode are the least secure and the most vulnerable to attack, so application permissions are limited. If an application needs to perform an action such as direct access to physical memory, it needs to make a request to the corresponding executive component in kernel mode.
  • the kernel mode of the system includes basic operating system primitives and functions, such as drivers, executive components, and so on.
  • The kernel mode can provide functions and semantics that can be called directly by the application programs of the application layer or by kernel drivers, such as an input/output (I/O) manager, an object manager, a process manager, a virtual memory manager, a configuration manager, and other components.
  • Different managers are used to manage different objects.
  • an object manager can be used to manage objects in an executive, which will not be repeated here.
  • Hardware devices can include microphones, speakers, mice, keyboards, monitors, disks, printers, and networks, etc. Different hardware devices can be handled by the operating system in a consistent manner through their respective drivers.
  • On the one hand, the system kernel accepts requests from application programs and communicates with the hardware device; on the other hand, when the hardware device sends a signal to the computer, the driver, together with the system kernel, receives the signal and passes it to the corresponding application program.
  • Taking the music application as an example, when the user clicks the icon of the music application to enter its running interface, the music application can call a system service function through the API and then call the related driver to create playback resources for the music application. After the user clicks the play button to start playing music and sets the audio to be output through the earphone, the audio data can be transmitted to the audio driver through the interface, and the audio driver can start to play the audio through hardware devices such as the earphone according to the created playback resource. The subsequent audio playback process will not be repeated.
  • FIG. 5 is a schematic diagram of an example of an audio splitting process provided by an embodiment of the present application.
  • The processing procedure in FIG. 5 may be based on the software structure shown in FIG. 4.
  • The audio service module can be implemented by an audio service program provided by the manufacturer and pre-installed by the user (such as the Huawei audio service provided by Huawei), and the audio processing module can be implemented by an audio software development kit provided by the manufacturer and pre-installed by the user (such as the Huawei audio software development kit, or Huawei audio SDK, provided by Huawei).
  • the audio service module provided by the Huawei audio service (Huawei audio service) program can be used to collect process-related information when the playback application is started, and send the process information to the audio processing objects (audio processing objects, APO) module.
  • the APO module can be understood as an audio driver in the user mode.
  • the APO module can collect audio data delivered by each playback application and allocate playback resources for each application.
  • the Huawei audio SDK can be understood as an interface for obtaining audio data of playback applications, or called "audio kit".
  • This interface can provide applications at the application layer (such as smart voice applications, multi-screen interactive applications, etc.) with the ability to obtain audio data corresponding to a specified process, and can obtain the audio data corresponding to the specified process in real time.
  • "Huawei audio service" and "Huawei audio SDK" are only example names; modules that have other names but achieve the same functions may also be used, which is not limited in this application.
  • the service environment will not be described in detail in the embodiment of this application.
  • The modules in the area above the dotted line in FIG. 5 correspond to the software modules in the user mode described above, which will not be repeated here.
  • the method 500 can include the following steps:
  • the user opens one or more playback applications, and the playback applications call WASAPI.
  • Opening one or more playback applications in step 501 can be understood as the user clicking the icon of each of the one or more playback applications to enter the running interface of that application. At this point, the user may not yet have clicked the play button, so the application has not started playing audio. Alternatively, the application can be opened through a shortcut and start playing audio at the same time, which is not limited in this embodiment of the present application.
  • the playing application may include any one or more of multiple applications such as conference calls, video applications, music applications, and game applications.
  • Taking the music application as an example, in step 501 the user can click the icon of the music application to enter its running interface, and in response to the user's operation, the music application can call a system interface such as WASAPI.
  • WASAPI may designate a corresponding process for each application, or WASAPI may call multiple other interfaces and functions of the system to obtain the pid of each application, which is not limited in this embodiment of the present application.
  • Table 1 shows an example of a possible correspondence between an application name and a process identifier.
  • The system can determine the pid of each currently running application through WASAPI. For example, the music application corresponds to pid 1, the video application corresponds to pid 2, and the teleconferencing application corresponds to pid 3; further examples are not listed here.
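The correspondence between application names and process identifiers can be pictured as a simple lookup. The following Python sketch is illustrative only: the names `APP_PIDS` and `pid_of` are hypothetical, and the pid values are the example values from the text rather than anything returned by a real system interface.

```python
from typing import Optional

# Example values from the text; real pids are assigned by the operating system.
APP_PIDS = {
    "music": 1,
    "video": 2,
    "teleconference": 3,
}


def pid_of(app_name: str) -> Optional[int]:
    """Return the pid recorded for an application, or None if none is recorded."""
    return APP_PIDS.get(app_name)


print(pid_of("video"))  # 2
print(pid_of("game"))   # None
```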
  • the audio service module obtains the pid of each application.
  • The audio service module can establish a connection with the WASAPI of the system, and the audio service module can access the process information of each application through WASAPI.
  • The timing of step 503 may be as follows: when the user opens the application and WASAPI obtains the pid corresponding to the application, WASAPI can automatically send the pid to the audio service module; alternatively, the audio service module may request the pid corresponding to the application after the user clicks to play the audio of the application, which is not limited in this embodiment of the present application.
  • the audio service module can also obtain more other information related to the application, such as more attribute information such as the application name, and application status information, etc., which is not limited in this embodiment of the present application.
  • the state information of the application may be used to indicate the playing state of the current application: playing or stopping playing.
  • the audio service module sends information including the pid of each application to the APO module.
  • the APO module receives the information of the pid of each application and determines the pid of each application.
  • step 503 may include different implementation manners.
  • For example, the audio service module can obtain the playing state information of each application and, when it determines from this information that an application is in the playing state, send the pid of that application to the APO module. For an application that is not playing, the pid may not be sent.
  • Alternatively, after the audio service module obtains the pid information of each application, it automatically sends the pid corresponding to each application to the APO module, which is not limited in this embodiment of the present application.
  • Alternatively, the APO module may request the pid information of each application from the audio service module. In response to the request, the audio service module performs step 504 and sends the information including the pid of each application to the APO module. The embodiment of the present application does not limit the timing and manner in which the audio service module sends the pid information of each application to the APO module.
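The variants above differ mainly in which pids are forwarded to the APO module. A minimal Python sketch of that selection follows; `AppInfo` and `pids_to_forward` are hypothetical names introduced for illustration, and the sketch models only the filtering decision, not any real inter-module communication.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AppInfo:
    name: str
    pid: int
    playing: bool  # True = playing, False = stopped


def pids_to_forward(apps: List[AppInfo], only_playing: bool = True) -> List[int]:
    """Select the pids the audio service module would send to the APO module.

    only_playing=True models the variant that checks the playing state first;
    only_playing=False models the variant that forwards every pid it obtains.
    """
    return [a.pid for a in apps if a.playing or not only_playing]


apps = [
    AppInfo("music", 1, playing=True),
    AppInfo("video", 2, playing=False),
    AppInfo("teleconference", 3, playing=True),
]
print(pids_to_forward(apps))                      # [1, 3]
print(pids_to_forward(apps, only_playing=False))  # [1, 2, 3]
```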
  • the APO module creates a playback resource for each application being played, and establishes a one-to-one correspondence between the pid of each application and the playback resource.
  • Fig. 6 is a schematic diagram of an audio stream processing procedure provided by an embodiment of the present application. As shown in Figure 6, in the process from the application program layer to the APO module, each application starts to run, and the APO module can create playback resources for each application.
  • the APO module can be divided into three layers, from top to bottom are stream effects (stream effects, SFX) layer, mode effects (mode effects, MFX) layer and endpoint effects (endpoint effects, EFX) layer.
  • The SFX layer can allocate a unique SFX object to each application, and the SFX object can receive the audio data issued by that application. Further down, at the MFX layer, the audio from the SFX objects of all applications is mixed together to form mixed audio.
  • the embodiment of the present application does not repeat the process of forming mixed audio data between the MFX layer and the EFX layer.
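To make the difference between per-application SFX buffers and the mixed audio concrete, the sketch below sums several buffers sample by sample. The mixing rule (plain summation with clipping to [-1.0, 1.0]) is an assumption chosen for illustration; the text does not specify how the mixing is performed, and `mix_streams` is a hypothetical helper.

```python
from typing import Dict, List


def mix_streams(sfx_buffers: Dict[str, List[float]]) -> List[float]:
    """Sum equal-length per-application buffers and clip to [-1.0, 1.0]."""
    lengths = {len(buf) for buf in sfx_buffers.values()}
    if len(lengths) > 1:
        raise ValueError("all buffers must have the same length")
    # zip(*...) walks the buffers in parallel, one sample position at a time.
    mixed = [sum(samples) for samples in zip(*sfx_buffers.values())]
    return [max(-1.0, min(1.0, s)) for s in mixed]


sfx_buffers = {
    "SFX 1": [0.5, -0.25, 0.0],
    "SFX 2": [0.25, -0.25, 0.75],
    "SFX 3": [0.5, 0.0, 0.5],
}
print(mix_streams(sfx_buffers))  # [1.0, -0.5, 1.0]
```

In this sketch the mixed buffer no longer carries the individual contributions, whereas each entry of `sfx_buffers` still holds the audio of exactly one application.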
  • The connection between the audio service module and the WASAPI of the system introduced in step 503 can be understood as a connection established between the audio service module and the SFX layer.
  • the audio service module can access each SFX object of the SFX layer through WASAPI, and then determine the process information of each application.
  • Taking the music application as an example, the APO module can create a playback resource for the music application, and the playback resource can be understood as the SFX object of the music application. The APO module then uses the pid information of each application to establish a one-to-one correspondence between the pid and the SFX object.
  • Table 2 shows an example of the correspondence between possible applications and playback resources.
  • The APO module can allocate an SFX object to each currently running application. For example, the music application corresponds to SFX 1, the video application corresponds to SFX 2, and the conference call application corresponds to SFX 3.
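A one-to-one correspondence between pids and SFX objects can be kept as two mirrored lookups, so that neither a pid nor an SFX object is ever bound twice. The Python sketch below is a hedged illustration: `PlaybackResourceTable` and its methods are hypothetical names, and the SFX objects are represented only by string labels.

```python
from typing import Dict


class PlaybackResourceTable:
    """Keeps a one-to-one mapping between pids and SFX object labels."""

    def __init__(self) -> None:
        self._sfx_by_pid: Dict[int, str] = {}
        self._pid_by_sfx: Dict[str, int] = {}

    def bind(self, pid: int, sfx: str) -> None:
        # Reject anything that would break the one-to-one property.
        if pid in self._sfx_by_pid or sfx in self._pid_by_sfx:
            raise ValueError("pid or SFX object is already bound")
        self._sfx_by_pid[pid] = sfx
        self._pid_by_sfx[sfx] = pid

    def sfx_for(self, pid: int) -> str:
        return self._sfx_by_pid[pid]

    def pid_for(self, sfx: str) -> int:
        return self._pid_by_sfx[sfx]


table = PlaybackResourceTable()
for pid, sfx in [(1, "SFX 1"), (2, "SFX 2"), (3, "SFX 3")]:
    table.bind(pid, sfx)

print(table.sfx_for(2))        # SFX 2
print(table.pid_for("SFX 3"))  # 3
```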
  • the APO module sends information about the established one-to-one correspondence between the pid and the SFX to the audio processing module.
  • the APO module may send the information of the one-to-one correspondence between the pid and the SFX to the audio processing module (such as Huawei audio SDK) in many possible ways.
  • the APO module can periodically send information about the one-to-one correspondence between pid and SFX to the audio processing module (such as Huawei audio SDK), or the APO module can receive a request from the audio processing module and, in response to the request, send the information about the one-to-one correspondence between the pid and the SFX. The embodiment of the present application does not limit the timing or manner in which the APO module sends this information to the audio processing module.
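The two delivery options mentioned above (periodic sending and sending on request) can be sketched as follows. All class and method names are assumptions made for illustration; the source leaves the timing and manner open, and a real periodic push would be driven by a timer that is omitted here.

```python
class AudioProcessingModule:
    """Hypothetical receiver that stores the pid -> SFX correspondence."""

    def __init__(self):
        self.pid_to_sfx = {}

    def on_mapping(self, mapping):
        # Store a copy so later changes on the sender side do not leak in.
        self.pid_to_sfx = dict(mapping)

    def refresh_from(self, apo):
        # Pull model: ask the APO module and store its answer.
        self.on_mapping(apo.handle_mapping_request())


class ApoModule:
    """Hypothetical sender holding the current correspondence."""

    def __init__(self, mapping):
        self._mapping = dict(mapping)

    def push_mapping(self, receiver):
        # Push model: would be called on each timer tick in a real system.
        receiver.on_mapping(self._mapping)

    def handle_mapping_request(self):
        # Pull model: answer an explicit request with a snapshot.
        return dict(self._mapping)


apo = ApoModule({101: "SFX 1", 102: "SFX 2"})

pushed = AudioProcessingModule()
apo.push_mapping(pushed)

pulled = AudioProcessingModule()
pulled.refresh_from(apo)
```

Either path leaves the audio processing module with the same stored correspondence.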
  • the audio processing module serves as an interface for accessing audio data for smart voice applications, multi-screen interactive applications, etc., and can store the one-to-one correspondence between the pid and the SFX.
  • the interface can also accept request messages sent by smart voice applications, multi-screen interactive applications, etc., and obtain audio data corresponding to specified processes for smart voice applications, multi-screen interactive applications, etc.
  • the audio processing module receives a request message from a smart voice application or a multi-screen interactive application, where the request message includes the pid of the first application, and the request message is used to request to acquire audio data of the first application.
  • the first application here may be one or more applications, or in other words, the audio data of multiple first applications is acquired simultaneously, which is not limited in this embodiment of the present application.
  • the smart voice application can send the request message to the audio processing module (such as the Huawei audio SDK provided by Huawei) to request the audio data of the conference call.
  • the first application may be a conference call application.
  • the translation software can send the request message to Huawei audio SDK to request to obtain the audio data of the video being played.
  • the first application may be a video application.
  • the multi-screen interactive application can send the request message to the audio processing module (such as the Huawei audio SDK provided by Huawei) to request the audio data of the video being played.
  • the first application may be a video application.
  • an image frame of the video needs to be obtained. The embodiment of the present application does not limit the process and method of obtaining the image frame.
  • the first application is multiple applications, that is, the audio data of the conference call and the audio data of the video application are simultaneously acquired.
  • when the audio processing module receives the request message from the smart voice application and the request message from the translation software, it can determine SFX3 according to the pid of the conference call and obtain the audio data of SFX3, and determine SFX2 according to the pid of the video application and obtain the audio data of SFX2, and then perform the corresponding processing. In this way the audio data of multiple different applications can be acquired simultaneously, which realizes the splitting of the audio data.
  • the audio processing module determines the SFX of the first application according to the pid of the first application, and acquires audio data from the SFX of the first application.
  • the audio processing module (such as the Huawei audio SDK provided by Huawei) obtains the pid of one or more applications included in the request message, determines the SFX corresponding to each application according to its pid, and obtains that application's audio data directly from the SFX.
  • the audio data stream of the system's APO module is a mixture of audio data from multiple applications, and the APO module cannot know which application each audio data belongs to.
  • the audio processing module returns the acquired audio data of the first application to the smart voice application or the multi-screen interactive application.
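The request handling described above (pid in the request, SFX lookup, audio returned) can be sketched as a dictionary lookup. This is a simplified sketch: the class name, the `request_audio` method, the pid values, and the sample data are all hypothetical, and real audio buffers would be streamed rather than copied as lists.

```python
class AudioProcessingModule:
    """Hypothetical front end that resolves pids to SFX audio buffers."""

    def __init__(self, pid_to_sfx, sfx_audio):
        self._pid_to_sfx = dict(pid_to_sfx)  # correspondence received from the APO
        self._sfx_audio = sfx_audio          # SFX name -> captured samples

    def request_audio(self, pids):
        """Return {pid: samples} for every pid named in the request."""
        result = {}
        for pid in pids:
            sfx = self._pid_to_sfx[pid]      # raises KeyError for an unknown pid
            result[pid] = list(self._sfx_audio[sfx])
        return result


module = AudioProcessingModule(
    pid_to_sfx={101: "SFX1", 102: "SFX2", 103: "SFX3"},
    sfx_audio={"SFX1": [0.1, 0.2], "SFX2": [0.3, 0.4], "SFX3": [0.5, 0.6]},
)

# One request may name several first applications, e.g. conference call and video.
conference_and_video = module.request_audio([103, 102])
```

Because each pid resolves to its own SFX, the conference call and video audio come back as separate buffers rather than as one mix.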
  • in step 507, when the user wishes to convert the audio of the current conference call into text and record it by means of the speech-to-text function of the smart voice application, then in step 509 the audio processing module returns the acquired audio data of the conference call application to the smart voice application, so that the smart voice application can convert the current audio data into text.
  • in step 507, when the user expects to convert the English audio of the video being played into Chinese subtitles by means of the translation service of the translation software, then in step 509 the audio processing module returns the acquired audio data of the video application to the translation software, so that the translation software can convert the current English audio into Chinese subtitles and display them on the screen in real time.
  • in step 507, when the user expects to switch the video of the PC to another electronic device such as a tablet or a smart screen for display by means of a multi-screen interactive application, then in step 509 the audio processing module returns the acquired audio data, image display data, and the like of the video application to the multi-screen interactive application, so that the multi-screen interactive application can switch both the audio and the image of the PC's video to the other electronic device.
  • the audio service module obtains the process information of each application and sends the process information to the APO module, so that the APO module can accurately match the corresponding SFX according to the process information of each application, that is, assign a unique SFX to each application; users can then accurately obtain the audio data corresponding to each of the multiple applications.
  • This process does not require users to switch audio drivers and audio devices through tedious operations, which simplifies the operation process.
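  • the method can split out the audio data of multiple applications simultaneously from the audio data of the multiple applications.
  • when there are multiple applications and the audio data of each application needs to be obtained at the same time, the method can provide the user with a distributed audio service capability and complete the audio data splitting quickly and accurately.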
  • this process does not require additional virtual audio drivers and virtual audio devices, and does not affect the use of special sound effects, thereby improving user experience.
  • FIG. 7 is a sequence diagram of an example of audio stream processing provided by the embodiment of the present application. As shown in FIG. 7 , the processing procedure 700 includes different stages before playing—playing started—playing stopped, and the processes involved in each stage are introduced below.
  • the audio service module can be implemented by an audio service program provided by the manufacturer and pre-installed by the user (such as Huawei audio service provided by Huawei), and the audio processing module can be implemented by an audio software development kit provided by the manufacturer and pre-installed by the user (such as the Huawei audio SDK provided by Huawei); the details will not be described again later.
  • the user opens a playback application, which calls an interface of the system to specify the corresponding process for each application, that is, to determine the pid of each application.
  • the playing application may include any one or more of conference calls, video applications, music applications, and game applications.
  • taking the music application as an example, in the first stage the user can click the icon of the music application to enter its running interface, and in response to the user's operation the music application can call a system interface, such as WASAPI.
  • this step 701 can refer to the process of step 502 in the aforementioned method 500: the interface of the system can determine the pid listed in Table 1 for each application, which will not be repeated here.
  • the system establishes a connection with the audio service module.
  • when the user runs the playback application, the system can be triggered to establish a connection, through WASAPI, with the audio service module provided by the embodiment of this application, and related information can be exchanged in the subsequent process.
  • the audio service module creates a control channel (control pipe) and a data channel (data pipe) for the APO.
  • the system triggers APO to create playback resources through WASAPI.
  • creating a playback resource here can be understood as creating a new SFX.
  • the user may click a play button of the music application to start playing music.
  • the system notifies the audio service module that it has started playing through WASAPI.
  • the audio service module acquires the process identifier.
  • the audio service module can also obtain more other information related to the application, such as more attribute information such as the application name, which is not limited in this embodiment of the present application.
  • the audio service module sends the process identification information to the APO.
  • the audio service module may send information including the process ID to the APO through the control pipe created in step 703; for example, the audio service module may send the process ID and application name listed in Table 1 to the APO. It should be understood that the embodiment of the present application does not limit the number of process identifiers or the number of applications.
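The hand-off of process information over the control channel can be sketched with an in-process queue. This is only an analogy under stated assumptions: `queue.Queue` stands in for the control pipe (the source does not specify the transport), and the message fields, function names, and pid values are hypothetical.

```python
import queue


def send_process_info(control_pipe, pid, app_name):
    """Audio service side: push one process-info message onto the pipe."""
    control_pipe.put({"pid": pid, "app_name": app_name})


def drain(control_pipe):
    """APO side: read every message currently waiting on the pipe."""
    messages = []
    while True:
        try:
            messages.append(control_pipe.get_nowait())
        except queue.Empty:
            return messages


control_pipe = queue.Queue()  # stand-in for the control pipe from step 703
send_process_info(control_pipe, 101, "music")
send_process_info(control_pipe, 103, "conference call")
received = drain(control_pipe)
```

Keeping control messages on their own channel, separate from the data channel, means the small pid messages are never queued behind audio payloads.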
  • the music application starts to send audio data to the underlying system.
  • enabling the SFX can be understood as assigning the SFX newly created in step 704 to the corresponding application.
  • the APO determines the correspondence between the process ID and the SFX.
  • the audio service module sends each acquired application and the process identification corresponding to each application to the APO, and the APO can pair one SFX for each process.
  • the APO can determine the one-to-one correspondence between the processes and the SFX.
  • this step 711 can refer to the process of step 505 in the aforementioned method 500, and the APO module can determine the corresponding relationship as listed in Table 2 according to the pid and SFX of each application, which will not be repeated here.
  • the APO sends information about the correspondence between the process identifier and the SFX to the audio processing module.
  • the audio processing module (for example, the Huawei audio SDK provided by Huawei) obtains the correspondence between the process identifier and the SFX.
  • the smart voice application can send a request to the audio processing module to obtain the audio data of the target application.
  • the audio processing module can further determine the SFX of the target application according to the pid of the target application, and then obtain audio data from the corresponding SFX, which is the audio data of the target application.
  • the interval between step 706 and step 709 has been measured at less than 1 millisecond, which ensures that audio data is acquired in real time.
  • the smart voice application or the multi-screen interactive application requests the audio processing module to acquire the audio data of the first application. It should be understood that the request message includes the pid of the first application.
  • the audio processing module determines the pid of the first application, and determines the SFX of the first application according to the pid of the first application.
  • the audio processing module acquires audio data from the SFX of the first application.
  • the audio processing module returns the audio data of the first application to the smart voice application or the multi-screen interactive application.
  • in step 713, when the user wishes to convert the audio of the current conference call into text and record it by means of the speech-to-text function of the smart voice application, then in step 716 the audio processing module returns the acquired audio data of the conference call application to the smart voice application, so that the smart voice application can convert the current audio data into text.
  • in step 713, when the user expects to convert the English audio of the video being played into Chinese subtitles by means of the translation service of the translation software, then in step 716 the audio processing module returns the acquired audio data of the video application to the translation software, so that the translation software can convert the current English audio into Chinese subtitles and display them on the screen in real time.
  • in step 713, when the user expects to switch the video of the PC to another electronic device such as a tablet or a smart screen for display by means of a multi-screen interactive application, then in step 716 the audio processing module returns the acquired audio data, image display data, and the like of the video application to the multi-screen interactive application, so that the multi-screen interactive application can switch both the audio and the image of the PC's video to the other electronic device.
  • the audio service module obtains the process information of each application and sends the process information to the APO module, so that the APO module can accurately match the corresponding SFX according to the process information of each application, that is, assign a unique SFX to each application; users can then accurately obtain the audio data corresponding to each of the multiple applications.
  • This process does not require users to switch audio drivers and audio devices through tedious operations, which simplifies the operation process.
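  • the method can split out the audio data of multiple applications simultaneously from the audio data of the multiple applications.
  • when there are multiple applications and the audio data of each application needs to be obtained at the same time, the method can provide the user with a distributed audio service capability and complete the audio data splitting quickly and accurately.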
  • this process does not require additional virtual audio drivers and virtual audio devices, and does not affect the use of special sound effects, thereby improving user experience.
  • when the user stops playback, the application can send a stop-playback command to the underlying system, and the system can call the related interfaces to perform an interrupt operation, that is, to interrupt the playback resource.
  • the application can send an instruction to terminate the audio service to the underlying system, and the system can call the relevant interface to execute the termination operation and at the same time delete the previously established communication channels, such as the control channel and the data channel.
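The difference between stopping playback and terminating the audio service can be sketched as two state changes on a per-application session. The class `PlaybackSession` and its fields are hypothetical names chosen for illustration; the source only states that stopping interrupts the playback resource and that termination also deletes the control and data channels.

```python
class PlaybackSession:
    """Hypothetical per-application session: one SFX plus its channels."""

    def __init__(self, pid):
        self.pid = pid
        self.sfx_active = True
        self.channels = {"control", "data"}
        self.received = []

    def write(self, samples):
        """Accept audio only while the playback resource is active."""
        if not self.sfx_active:
            return False
        self.received.extend(samples)
        return True

    def stop(self):
        """Stop playback: interrupt the playback resource, keep the channels."""
        self.sfx_active = False

    def terminate(self):
        """Terminate the audio service: interrupt and delete the channels."""
        self.stop()
        self.channels.clear()


session = PlaybackSession(pid=101)   # illustrative pid
accepted_before = session.write([1, 2])
session.stop()
accepted_after = session.write([3])
channels_after_stop = set(session.channels)
session.terminate()
```

After `stop` the channels still exist, so playback could resume without re-establishing them; after `terminate` they are gone.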
  • the electronic device includes hardware and/or software modules corresponding to each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions in combination with the embodiments for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the functional modules of the electronic device may be divided according to the above method example.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above integrated modules may be implemented in the form of hardware. It should be noted that the division of modules in this embodiment is schematic, and is only a logical function division, and there may be other division methods in actual implementation.
  • the electronic device may include: a display unit, a detection unit, a processing unit, and the like.
  • the display unit, the detection unit, and the processing unit cooperate with each other, and may be used to support the electronic device to perform the steps described above, and/or be used in other processes of the technologies described herein.
  • the electronic device provided in this embodiment is used to implement the above audio splitting method, so the same effect as the above implementation method can be achieved.
  • the electronic device may include a processing module, a storage module and a communication module.
  • the processing module can be used to control and manage the actions of the electronic device, for example, it can be used to support the electronic device to execute the steps performed by the above-mentioned display unit, detection unit and processing unit.
  • the storage module can be used to support the electronic device in storing program code, data, and the like.
  • the communication module can be used to support the communication between the electronic device and other devices.
  • the processing module may be a processor or a controller. It can implement or execute the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
  • the processor can also be a combination that implements computing functions, such as a combination of one or more microprocessors, or a combination of a digital signal processor (DSP) and a microprocessor.
  • the storage module may be a memory.
  • the communication module may be a device that interacts with other electronic devices, such as a radio frequency circuit, a Bluetooth chip, and a Wi-Fi chip.
  • the electronic device involved in this embodiment may be a device having the structure shown in FIG. 3 .
  • This embodiment also provides a computer-readable storage medium that stores computer instructions; when the computer instructions are run on the electronic device, the electronic device executes the above-mentioned relevant method steps to implement the audio splitting method in the above-mentioned embodiments.
  • This embodiment also provides a computer program product, which, when running on a computer, causes the computer to execute the above related steps, so as to implement the audio splitting method in the above embodiment.
  • an embodiment of the present application also provides a device, which may specifically be a chip, a component or a module, and the device may include a connected processor and a memory; wherein the memory is used to store computer-executable instructions, and when the device is running, The processor can execute the computer-executable instructions stored in the memory, so that the chip executes the audio splitting method in the above method embodiments.
  • the electronic device, computer-readable storage medium, computer program product, or chip provided in this embodiment is used to execute the corresponding method provided above; for the beneficial effects it can achieve, refer to the beneficial effects of the corresponding method above, which will not be repeated here.
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of modules or units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components can be combined or integrated into another device, or some features may be omitted or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • a unit described as a separate component may or may not be physically separated, and a component shown as a unit may be one physical unit or multiple physical units, which may be located in one place or distributed to multiple different places. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if an integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • the technical solution of the embodiment of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods in the various embodiments of the present application.
  • the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Stored Programmes (AREA)

Abstract

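The present application provides an audio splitting method and an electronic device. The method can be applied to devices such as a PC. When multiple applications play audio through one audio device at the same time, the audio service module can obtain the process information of each application before the APO module mixes the audio data of the multiple applications, and send the process information to the APO module, so that the APO module can accurately assign a unique playback resource (SFX) to each application according to that application's process information. The user can then accurately obtain the audio data corresponding to each of the multiple applications. This process gives the user a distributed audio service capability, does not require the user to switch audio drivers or audio devices, and completes audio data splitting quickly and accurately. Furthermore, the process requires no additional virtual audio driver or virtual audio device and does not affect the use of special sound effects, which improves the user experience.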

Description

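Audio splitting method and electronic device
This application claims priority to Chinese patent application No. 202110598410.0, entitled "Audio splitting method and electronic device" and filed with the China National Intellectual Property Administration on May 28, 2021, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of electronic technologies, and in particular to an audio splitting method and an electronic device.
Background
With the development of electronic technologies and electronic devices, a user can use different functions of one electronic device at the same time. For example, a user may use a personal computer (PC) for a conference call while also using the PC to watch a video, play music, and so on. Specifically, a PC has different audio endpoint devices such as a headset, a microphone, and a speaker, which may be called the "audio devices" or "endpoints" of the PC. An audio device may be either built into the PC or external to it; for example, a headset connected to the PC through the headset jack can serve as an audio playback device of the PC, and a speaker may be a built-in device of the PC.
In a scenario where the PC is used for a conference call and for watching a video at the same time, the user may connect an external headset through the PC's headset jack and use it both to listen to the conference call and to listen to the audio of the video. In one possible scenario, the user may want translation software to translate the subtitles of the video being played. For example, the video clip being played has English subtitles; by running translation software that obtains the audio of the video, the obtained English audio can be translated in real time and the corresponding Chinese subtitles displayed in the video playback window for the user to view.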
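For a PC with the system identified by the inline image Figure PCTCN2022084069-appb-000001, when the user plays the audio of the conference call and the audio of the current video clip through one and the same audio device (for example, an external headset) and additionally wants translation software to translate the audio of the current video in real time, the translation software obtains mixed audio that includes both the conference call audio and the audio of the current video clip, which impairs translation accuracy. If the user runs even more playback applications at the same time, the translation software cannot extract the audio of the current video from the mixed audio of all those applications, which affects the translation process and its accuracy and degrades the user experience.
Alternatively, when the user uses an application such as smart voice to perform speech-to-text conversion on the conference call audio, the application likewise obtains mixed audio that includes both the conference call audio and the audio of the current video clip, which impairs the accuracy of speech-to-text conversion.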
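Summary
The present application provides an audio splitting method and an electronic device. The method gives the user a distributed audio service capability, does not require the user to switch audio drivers or audio devices, and completes audio data splitting quickly and accurately. Furthermore, the process requires no additional virtual audio driver or virtual audio device and does not affect the use of special sound effects, which improves the user experience.
According to a first aspect, an audio splitting method is provided, applied to an electronic device that includes one or more audio devices. The method includes: receiving a first operation of a user and, in response to the first operation, playing, by the electronic device through a first audio device, audio corresponding to M applications, where the first audio device is any one of the one or more audio devices, each of the M applications is an application capable of outputting corresponding audio, and M is an integer greater than or equal to 2; receiving a second operation of the user, where the second operation is used to request audio data of a first application, and the first application is any one of the M applications; in response to the second operation, determining the playback resource of the first application from the playback resources associated with the M applications; and obtaining the audio data of the first application from the playback resource of the first application.
It should be understood that the "M applications" in the embodiments of the present application may be playback applications. For example, the playback applications may include any one or more of a conference call application, a video application, a music application, a game application, and the like.
Optionally, the electronic device may include different audio devices (endpoints) such as a microphone, a headset, and a speaker. Specifically, taking a PC as an example, for the audio output by the PC, the user can set the PC to play audio from different applications through any one or more audio devices such as the headset and the speaker. The present application mainly addresses the scenario in which the PC plays the audio of different applications through any single audio device such as the headset or the speaker.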
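Optionally, the audio splitting process provided in the present application may be based on the software structure of the system identified by the inline image Figure PCTCN2022084069-appb-000002, and additionally requires an audio service platform to be built. The audio service platform may be implemented by an audio-service-related software development kit; for example, this software development kit may provide an audio service module and an audio processing module.
In a possible implementation, the audio service module may be implemented by an audio service program provided by the manufacturer and pre-installed by the user (for example, Huawei audio service provided by Huawei), and the audio processing module may be implemented by an audio software development kit provided by the manufacturer and pre-installed by the user (for example, the Huawei audio software development kit, Huawei audio SDK, provided by Huawei).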
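In one possible scenario, when the user wants to use the speech-to-text function of a smart voice application to convert the audio of the current conference call into a text record, the audio processing module of the electronic device can return the acquired audio data of the conference call application to the smart voice application, so that the smart voice application can convert the current audio data into text.
Alternatively, when the user wants to use the translation service of translation software to convert the English audio of the video being played into Chinese subtitles, the audio processing module of the electronic device can return the acquired audio data of the video application to the translation software, so that the translation software can convert the current English audio into Chinese subtitles in real time and display them on the screen.
As another alternative, when the user wants to use a multi-screen interactive application to switch the PC's video to another electronic device such as a tablet or a smart screen for display, the audio processing module of the electronic device can return the acquired audio data, image display data, and the like of the video application to the multi-screen interactive application, so that the multi-screen interactive application can switch both the audio and the image of the PC's video to the other electronic device. Further examples are not listed here.
With the above method, when multiple applications play audio through one audio device at the same time, the audio service module obtains the process information of each application before the audio processing objects (APO) module of the electronic device mixes the audio data of the multiple applications, and sends the process information to the APO module. The APO module can then accurately match the corresponding SFX according to each application's process information, that is, assign a unique SFX to each application, so that the user can accurately obtain the audio data corresponding to each of the multiple applications. The process does not require the user to switch audio drivers and audio devices through tedious operations, which simplifies the operation flow.
In addition, the method can split out the audio data of multiple applications simultaneously from the audio data of the multiple applications. When there are multiple applications and the audio data of each one needs to be obtained at the same time, the method gives the user a distributed audio service capability and completes audio data splitting quickly and accurately. Furthermore, the process requires no additional virtual audio driver or virtual audio device and does not affect the use of special sound effects, which improves the user experience.
Optionally, the first application in the present application may be multiple applications, that is, the audio data of the conference call and the audio data of the video application are obtained at the same time. For example, suppose the user obtains the audio of the current conference call through the speech-to-text function of a smart voice application and at the same time obtains the audio of the video being played through translation software. When the audio processing module receives the request message from the smart voice application and the request message from the translation software, it can determine playback resource SFX3 according to the pid of the conference call and obtain the audio data of SFX3, and determine playback resource SFX2 according to the pid of the video application and obtain the audio data of SFX2, and then perform the corresponding processing. In this way the audio data of multiple different applications can be obtained simultaneously, which realizes the splitting of the audio data.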
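With reference to the first aspect, in some implementations of the first aspect, the method further includes: determining a mapping between M process identifiers corresponding to the M applications and the playback resources corresponding to the M applications, where each of the M applications corresponds to one process identifier and each of the M applications corresponds to one playback resource.
Optionally, when the user runs multiple applications, the electronic device can determine the pid of each currently running application through WASAPI. For example, the music application corresponds to pid 1, the video application to pid 2, and the conference call application to pid 3. The APO module creates a playback resource SFX for each application that is playing and establishes a one-to-one correspondence between each application's pid and its playback resource SFX.
In a possible implementation, the APO module may send the information about the one-to-one correspondence between pid and SFX to the audio processing module (for example, Huawei audio SDK) in a number of ways. For example, the APO module may periodically send the information about the one-to-one correspondence between pid and SFX to the audio processing module (for example, Huawei audio SDK), or the APO module may receive a request from the audio processing module and, in response to the request, send the information about the one-to-one correspondence between pid and SFX to the audio processing module. The present application does not limit the timing or manner in which the APO module sends this information to the audio processing module.
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, determining the playback resource of the first application from the playback resources corresponding to the M applications includes: determining the process identifier of the first application; and determining, according to the process identifier of the first application and the mapping, the playback resource of the first application from the M playback resources.
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, the method further includes: running, by the electronic device, the M applications and allocating M playback resources to the M applications; and obtaining the M process identifiers corresponding to the M applications.
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, the method further includes: receiving a third operation of the user, where the third operation is used to request that playback of the first application's audio be interrupted; and in response to the third operation, suspending, by the playback resource of the first application, reception of the audio data of the first application.
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, the method further includes: receiving a fourth operation of the user, where the fourth operation is used to close the first application; and in response to the fourth operation, suspending, by the playback resource of the first application, reception of the audio data of the first application, and releasing the playback resource of the first application.
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, the M applications include any of a music application, a video application, a game application, and a conference application.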
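With the above method, when multiple applications play audio through one audio device at the same time, the audio service module obtains the process information of each application before the APO module mixes the audio data of the multiple applications, and sends the process information to the APO module. The APO module can then accurately match the corresponding SFX according to each application's process information, that is, assign a unique SFX to each application, so that the user can accurately obtain the audio data corresponding to each of the multiple applications. The process does not require the user to switch audio drivers and audio devices through tedious operations, which simplifies the operation flow.
In addition, the method can split out the audio data of multiple applications simultaneously from the audio data of the multiple applications. When there are multiple applications and the audio data of each one needs to be obtained at the same time, the method gives the user a distributed audio service capability and completes audio data splitting quickly and accurately. Furthermore, the process requires no additional virtual audio driver or virtual audio device and does not affect the use of special sound effects, which improves the user experience.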
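According to a second aspect, an electronic device is provided, including: one or more audio devices; one or more processors; one or more memories; and a module on which multiple applications are installed. The memory stores one or more programs that, when executed by the processor, cause the electronic device to perform the following steps: receiving a first operation of a user and, in response to the first operation, playing audio corresponding to M applications through a first audio device, where the first audio device is any one of the one or more audio devices, each of the M applications is an application capable of outputting corresponding audio, and M is an integer greater than or equal to 2; receiving a second operation of the user, where the second operation is used to request audio data of a first application, and the first application is any one of the M applications; in response to the second operation, determining the playback resource of the first application from the playback resources associated with the M applications; and obtaining the audio data of the first application from the playback resource of the first application.
With reference to the second aspect, in some implementations of the second aspect, when the one or more programs are executed by the processor, the electronic device is caused to perform the following steps: determining a mapping between M process identifiers corresponding to the M applications and the playback resources corresponding to the M applications, where each of the M applications corresponds to one process identifier and each of the M applications corresponds to one playback resource.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, when the one or more programs are executed by the processor, the electronic device is caused to perform the following steps: determining the process identifier of the first application; and determining, according to the process identifier of the first application and the mapping, the playback resource of the first application from the M playback resources.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, when the one or more programs are executed by the processor, the electronic device is caused to perform the following steps: running the M applications and allocating M playback resources to the M applications; and obtaining the M process identifiers corresponding to the M applications.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, when the one or more programs are executed by the processor, the electronic device is caused to perform the following steps: receiving a third operation of the user, where the third operation is used to request that playback of the first application's audio be interrupted; and in response to the third operation, suspending, by the playback resource of the first application, reception of the audio data of the first application.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, when the one or more programs are executed by the processor, the electronic device is caused to perform the following steps: receiving a fourth operation of the user, where the fourth operation is used to close the first application; and in response to the fourth operation, suspending, by the playback resource of the first application, reception of the audio data of the first application, and releasing the playback resource of the first application.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, the M applications include any of a music application, a video application, a game application, and a conference application.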
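According to a third aspect, an electronic device is provided, including: one or more processors; one or more memories; and a module on which multiple applications are installed. The memory stores one or more programs that, when executed by the processor, cause the electronic device to perform the steps performed in any possible implementation of any of the foregoing aspects.
According to a fourth aspect, a graphical user interface on an electronic device is provided. The electronic device has a display, a memory, and one or more processors configured to execute one or more computer programs stored in the memory. The graphical user interface includes the graphical user interface displayed when the electronic device performs any possible audio splitting method of any of the foregoing aspects.
According to a fifth aspect, an apparatus is provided. The apparatus is included in an electronic device and has the function of implementing the behavior of the electronic device in the first aspect and the possible implementations of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules or units corresponding to the function, for example, a display module or unit, a detection module or unit, and a processing module or unit.
According to a sixth aspect, a computer storage medium is provided, including computer instructions that, when run on an electronic device, cause the electronic device to perform any possible audio splitting method of any of the foregoing aspects.
According to a seventh aspect, a computer program product is provided that, when run on an electronic device, causes the electronic device to perform any possible audio splitting method of any of the foregoing aspects.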
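Brief Description of the Drawings
FIG. 1 is a schematic diagram of an example scenario in which a user uses multiple applications on a PC.
FIG. 2 is a schematic diagram of an example audio splitting process.
FIG. 3 is a schematic structural diagram of an example electronic device provided by an embodiment of the present application.
FIG. 4 is a block diagram of the software structure of an example electronic device provided by an embodiment of the present application.
FIG. 5 is a schematic diagram of an example audio splitting process provided by an embodiment of the present application.
FIG. 6 is a schematic diagram of an example audio stream processing procedure provided by an embodiment of the present application.
FIG. 7 is a sequence diagram of an example audio stream processing procedure provided by an embodiment of the present application.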
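Detailed Description
FIG. 1 is a schematic diagram of an example scenario in which a user uses multiple applications on a PC. As shown in (a) of FIG. 1, the user uses the PC 100 to play a video clip, which is played in window 10 on the display of the PC 100; at the same time the user uses conference call software for a conference call, and the conference call window 20 is also shown on the display of the PC 100. It should be understood that the embodiments of the present application do not limit the application that plays the video or the application used for the conference call.
It should also be understood that the PC 100 may include different audio devices (endpoints) such as a microphone, a headset, and a speaker. Specifically, for the audio output by the PC 100, the user can set the PC 100 to play audio from different applications through any one or more audio devices such as the headset and the speaker.
For example, in the scenario shown in (a) of FIG. 1, the user can set the PC 100 to play the audio received during the conference call through the speaker 110 and, at the same time, to play the audio of the video through the speaker 110. Alternatively, the PC 100 can be set to play the audio received during the conference call through the external headset 120 and, at the same time, to play the audio of the video through the speaker 110. As another alternative, the PC 100 can be set to play the audio of the video through the speaker 110 and to play the audio received during the conference call through the external headset 120.
In one possible scenario, as shown in (b) of FIG. 1, the user sets the PC 100 to obtain, through the external headset 120, the audio received during the conference call from the phone number "138-XXXX-0493", and to play the audio of the video through the headset 120. In other words, in the scenarios shown in (a) and (b) of FIG. 1, the PC outputs both the conference call audio and the audio of the video being played through the external headset 120. For example, when the PC 100 outputs audio through the external headset 120, the name of the headset 120 may be denoted "Synaptics HD AXX".
It should be understood that during the conference call the headset 120 can also capture the current user's speech and input it to the PC 100 through the microphone; for example, when the headset inputs audio, the microphone name may be denoted "Synaptics HD BXX". The embodiments of the present application focus on the headset 120 as an audio playback device that outputs the audio of one or more playback applications of the PC 100; the process in which the headset 120 acts as an audio receiving device to capture the current user's speech is not described.
In the scenarios shown in (a) and (b) of FIG. 1, the user can additionally run translation software that obtains, in real time, the English subtitles in window 10 of the video being played and translates the obtained English audio. However, this process relies on the Windows Audio Session API (WASAPI), so the translation software obtains the mixed audio of the conference call and the video and cannot translate accurately.
Alternatively, when the user wants to convert the audio of the current conference call into a text record, the speech-to-text function of smart voice can be used. In this process the smart voice application also obtains the mixed audio of the conference call and the video, which lowers the speech recognition accuracy, reduces the efficiency of audio-to-text conversion, and degrades the user experience.
As another alternative, when the user wants to switch the video on the PC 100 to another electronic device such as a tablet or a smart screen for playback, a multi-screen application can be used to switch the picture in window 10 to the other electronic device. However, this process cannot separate the audio of the video from the mixed audio; that is, when the picture in window 10 is switched to the other electronic device, the audio of the video cannot be switched to that device for playback at the same time, which degrades the user's multi-screen experience.
In summary, if the user runs a larger number of playback applications on the PC 100 at the same time, for example several music applications and game applications, and the audio of these applications is played through the same audio device (endpoint), for example an external headset, then what is captured at that endpoint is mixed audio from multiple applications. Functions such as real-time translation and multi-screen switching cannot be realized, which degrades the user experience.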
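FIG. 2 is a schematic diagram of an example audio splitting process. This method mainly relies on virtual sound card technology to extract the audio of one particular application from the mixed audio of multiple applications. To use the virtual sound card technology, the user first needs to install a virtual audio driver on the PC; after installation, a virtual sound card (voice meeter) is automatically generated on the PC.
It should be understood that, assuming the applications currently running on the PC include a conference call, a video application, and a music application, in general all three can call the system's default audio driver through an application programming interface (API) and play the conference call audio, the video audio, and the music audio simultaneously through the same audio device (for example, the external headset 120).
When the user wants to separate the conference call audio from the mixed audio, the user can manually set the conference call audio to be played through the virtual sound card. Optionally, as shown in (b) of FIG. 1, the user can use the relevant options in the "App volume and device preferences" menu to mount the conference call audio under the virtual audio driver, that is, to play it through the virtual sound card. The audio of the video application and the audio of the music application are mounted under the system's default audio driver, that is, played through the system's default audio device. The process may include the steps shown in FIG. 2:
Step 1: Playback applications such as the conference call, the video application, and the music application each have a corresponding process specified through the API, that is, each application is assigned a process identifier (pid).
It should be understood that a "playback application" in the embodiments of the present application may include any application that produces audio output while running, such as a music application, a video application, a conference call application, or a game application; further examples are not listed.
Step 2: The user starts the speech-to-text function of smart voice, which triggers the smart voice application to request, through the API, the pid of each running playback application, and the user manually sets the conference call audio to be played through the virtual sound card.
Optionally, the user can follow the method shown in (b) of FIG. 1 and change the output of the conference call application to an audio device associated with the virtual sound card in the "App volume and device preferences" menu; details are not repeated here.
It should be understood that, besides the speech-to-text function of smart voice, the request through the API for the pid of each playing application can also be triggered by translation applications, by multi-screen interactive applications that project a playback application onto another electronic device, and so on; the embodiments of the present application do not limit this.
Step 3: According to the user's settings, the API determines the pid of the conference call, mounts the conference call process on the virtual audio driver, and calls the virtual sound card to play the conference call audio.
Step 4: Through the API, the process of the music application and the process of the video application are mounted on the system's default audio driver, and the system's default audio device is called to play the audio of the music application and the audio of the video application.
Step 5: The smart voice application requests the audio data of the conference call on the virtual audio driver.
Step 6: The audio data of the conference call is returned to the smart voice application, which performs speech-to-text conversion on the obtained audio data.
It should be understood that the embodiments of the present application do not describe the speech-to-text process or the speech recognition and other technologies it relies on.
It should also be understood that the type of API in each step may be the same or different; for example, step 2 in the above process may be implemented through WASAPI and step 3 through another interface. The embodiments of the present application do not limit the API types used in a specific implementation.
The above steps separate the conference call audio from the mixed audio of the conference call, the video application, and the music application currently running on the PC. However, the process requires the user to manually set the conference call to be loaded through the virtual sound card, after which a multi-screen interactive application or a smart voice application obtains the audio data of the virtual audio driver. The operation is very cumbersome and requires an additional virtual audio driver node.
In addition, because the audio data is already mixed once it reaches the driver layer (for example, the system's default audio driver or the virtual audio driver), this technology can obtain the audio data of only a single application. For example, if the user only sets the conference call audio to be played through the virtual sound card, only the conference call audio data is obtained. When there are multiple applications and the audio data of each one needs to be obtained at the same time, for example when the user wants to obtain the audio data of each of the conference call, the music application, and the video application simultaneously, the audio data cannot be split.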
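The limitation of the virtual sound card approach can be illustrated with a short Python sketch. It models capture at the driver level, where everything routed to one device is already summed; the function name, the device labels "virtual" and "default", and the integer samples are illustrative assumptions rather than anything specified by the source.

```python
def capture_per_device(routing, audio):
    """Return device -> mixed samples, given app -> device and app -> samples."""
    captured = {}
    for app, device in routing.items():
        samples = audio[app]
        if device not in captured:
            captured[device] = list(samples)
        else:
            # Audio reaching the same driver is summed into one mixed stream.
            captured[device] = [a + b for a, b in zip(captured[device], samples)]
    return captured


# Only the conference call is manually routed to the virtual sound card.
routing = {"conference": "virtual", "video": "default", "music": "default"}
audio = {"conference": [1, 1], "video": [2, 2], "music": [3, 3]}  # made-up samples

captured = capture_per_device(routing, audio)
```

In this model the virtual device yields the conference call alone, but the default device yields only the sum of video and music, so obtaining every application's audio separately would need one virtual device per application.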
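Furthermore, special sound effects (for example, Dolby sound effects) are generally mounted under the system default audio driver. If the user wants to separate the audio of the video application, the audio of the current video has to be mounted onto the virtual audio driver and played through the virtual sound card; the special sound effects then become unavailable, which degrades the viewing experience.
Therefore, the embodiments of this application provide an audio splitting method that can split the audio data of multiple applications. The technical solutions in the embodiments of this application are described below with reference to the accompanying drawings.
In the description of the embodiments of this application, unless otherwise stated, "/" means "or"; for example, A/B may denote A or B. "And/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may denote three cases: A alone, both A and B, and B alone. In addition, "multiple" means two or more.
Hereinafter, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features.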
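The audio splitting method provided in the embodiments of this application can be applied to electronic devices such as mobile phones, tablet computers, wearable devices, in-vehicle devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks and personal digital assistants (PDA). The specific type of the electronic device is not limited in the embodiments of this application.
By way of example, FIG. 3 is a schematic structural diagram of an example electronic device 100 provided in an embodiment of this application. The electronic device 100 may correspond to the PC in FIG. 1 and includes a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, buttons 190, an indicator 192, a camera 193, a display 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a touch sensor 180K, a fingerprint sensor 180H, and the like.
It can be understood that the structure illustrated in the embodiments of this application does not constitute a specific limitation on the electronic device 100. In other embodiments of this application, the electronic device 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units, for example an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent components or may be integrated into one or more processors.
The processor 110 may serve as the nerve center and command center of the electronic device. It can generate operation control signals according to instructions and thereby control instruction fetching and instruction execution.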
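In some embodiments of this application, the processor 110 may be used to control the audio module 170 to collect the audio data of multiple applications. When the electronic device 100 projects a video window onto another electronic device such as a tablet or a smart screen, the processor 110 may also control the output of the audio data of the video being played in that window to the other electronic device.
A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache, which can hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory, which avoids repeated accesses, reduces the waiting time of the processor 110 and thus improves system efficiency.
In some embodiments, the processor 110 may include one or more interfaces, such as an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.
The USB interface 130 conforms to the USB standard and may specifically be a Mini USB interface, a Micro USB interface, a USB Type-C interface, or the like. It can be used to connect a charger to charge the electronic device 100, to transfer data between the electronic device 100 and peripheral devices, or to connect a headset and play audio through it. The interface can also be used to connect other electronic devices, such as AR devices.
It can be understood that the interface connections between the modules illustrated in the embodiments of this application are only schematic and do not constitute a structural limitation on the electronic device 100. In other embodiments of this application, the electronic device 100 may use interface connection methods different from those in the above embodiments, or a combination of several interface connection methods.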
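The charging management module 140 receives charging input from a charger, which may be a wireless or a wired charger. In some wired charging embodiments, the charging management module 140 may receive the charging input of a wired charger through the USB interface 130. In some wireless charging embodiments, it may receive wireless charging input through a wireless charging coil of the electronic device 100. While charging the battery 142, the charging management module 140 may also supply power to the electronic device through the power management module 141.
The power management module 141 connects the battery 142, the charging management module 140 and the processor 110. It receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110, the internal memory 121, the external memory, the display 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also monitor parameters such as battery capacity, battery cycle count and battery health (leakage, impedance). In some other embodiments, the power management module 141 may be provided in the processor 110; in still other embodiments, the power management module 141 and the charging management module 140 may be provided in the same component.
The wireless communication function of the electronic device 100 can be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.
The antenna 1 and the antenna 2 transmit and receive electromagnetic wave signals. Each antenna in the electronic device 100 can cover one or more communication bands. Different antennas can also be multiplexed to improve antenna utilization; for example, the antenna 1 may be multiplexed as a diversity antenna of the wireless local area network. In some other embodiments, an antenna may be used together with a tuning switch.
The mobile communication module 150 can provide wireless communication solutions applied to the electronic device 100, including 2G/3G/4G/5G. It may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received waves, and pass them to the modem processor for demodulation. It can also amplify the signal modulated by the modem processor and radiate it as electromagnetic waves through the antenna 1. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the processor 110, or in the same component as at least some modules of the processor 110.
The modem processor may include a modulator and a demodulator. The modulator modulates the low-frequency baseband signal to be sent into a medium- or high-frequency signal. The demodulator demodulates the received electromagnetic wave signal into a low-frequency baseband signal and passes it to the baseband processor. After processing by the baseband processor, the low-frequency baseband signal is passed to the application processor, which outputs a sound signal through an audio device (not limited to the speaker 170A and the receiver 170B) or displays an image or video through the display 194. In some embodiments, the modem processor may be an independent component. In other embodiments, the modem processor may be independent of the processor 110 and provided in the same component as the mobile communication module 150 or other functional modules.
The wireless communication module 160 can provide wireless communication solutions applied to the electronic device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC) and infrared (IR). The wireless communication module 160 may be one or more components integrating at least one communication processing module. It receives electromagnetic waves through the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signal, and sends the processed signal to the processor 110. It can also receive a signal to be sent from the processor 110, perform frequency modulation and amplification on it, and radiate it as electromagnetic waves through the antenna 2.
In some embodiments, the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150 and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with networks and other devices through wireless communication technologies. These technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM and/or IR. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS) and/or satellite based augmentation systems (SBAS).
In some embodiments of this application, the electronic device and other electronic devices can exchange information over wireless communication technologies using their respective antennas and mobile communication modules. For example, the PC can send the picture display data and audio data of a video playback window to another electronic device such as a tablet or a mobile phone, which then plays the video. Details are not repeated here.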
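The electronic device 100 implements the display function through the GPU, the display 194, the application processor, and the like. The GPU is a microprocessor for image processing that connects the display 194 and the application processor. It performs mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display 194 displays images, videos, and the like, and includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, quantum dot light-emitting diodes (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N displays 194, where N is a positive integer greater than 1.
In some embodiments of this application, the GPU can render application interfaces and the display 194 can display the interfaces rendered by the GPU. For example, in the scenario shown in FIG. 1, the GPU of the PC 100 can render the interface from the image data of the video application, and the picture of the video playback window 10 is displayed on the display of the PC 100. Details are not repeated here.
The electronic device 100 can implement the shooting function through the ISP, the camera 193, the video codec, the GPU, the display 194, the application processor, and the like.
The external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement data storage, for example saving music and video files on the external memory card.
The internal memory 121 can store computer-executable program code, which includes instructions. The processor 110 executes the functional applications and data processing of the electronic device 100 by running the instructions stored in the internal memory 121. The internal memory 121 may include a program storage area and a data storage area. The program storage area can store the operating system and the applications required by at least one function (such as sound playback or image playback). The data storage area can store data created during use of the electronic device 100 (such as audio data and a phone book). In addition, the internal memory 121 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device or universal flash storage (UFS).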
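The electronic device 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset jack 170D, the application processor, and the like.
The audio module 170 converts digital audio information into an analog audio signal for output and converts analog audio input into a digital audio signal. It can also encode and decode audio signals. In some embodiments, the audio module 170, or some of its functional modules, may be provided in the processor 110.
The speaker 170A, also called a "loudspeaker", converts an audio electrical signal into a sound signal. The user can listen to music or a hands-free call through the speaker 170A of the electronic device 100.
The receiver 170B, also called an "earpiece", converts an audio electrical signal into a sound signal. When the electronic device 100 answers a call or plays a voice message, the user can listen by holding the receiver 170B close to the ear.
The microphone 170C, also called a "mic" or "sound transmitter", converts a sound signal into an electrical signal. When making a call or sending a voice message, the user can speak close to the microphone 170C to input the sound signal. The electronic device 100 may be provided with at least one microphone 170C. In other embodiments, it may be provided with two microphones 170C, which can also implement noise reduction in addition to collecting sound signals. In still other embodiments, it may be provided with three, four or more microphones 170C to collect sound signals, reduce noise, identify the sound source, implement directional recording, and so on.
The headset jack 170D is used to connect wired headsets. It may be the USB interface 130, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
In some embodiments of this application, any audio device associated with the audio module 170, such as the speaker 170A or a headset connected to the headset jack 170D, can simultaneously play audio data originating from multiple applications. For example, in the scenario shown in (a) of FIG. 1, the user can set the audio of a video clip and the audio of a conference call to be played through the headset at the same time.
The fingerprint sensor 180H collects fingerprints. The electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, application lock access, fingerprint photographing, fingerprint call answering, and the like.
The touch sensor 180K is also called a "touch panel". It may be provided on the display 194, and together they form a touchscreen. The touch sensor 180K detects touch operations on or near it and can pass a detected touch operation to the application processor to determine the type of touch event. Visual output related to the touch operation can be provided through the display 194. In other embodiments, the touch sensor 180K may be provided on the surface of the electronic device 100 at a position different from that of the display 194.
The buttons 190 include a power button, volume buttons, and the like. A button 190 may be a mechanical button or a touch button. The electronic device 100 can receive button input and generate key signal input related to user settings and function control of the electronic device 100.
The indicator 192 may be an indicator light, which can indicate the charging status and battery level changes as well as messages, missed calls, notifications, and the like.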
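In this application, the operating system of the electronic device 100 may include, but is not limited to, [Figure PCTCN2022084069-appb-000003], [Figure PCTCN2022084069-appb-000004], Harmony (鸿蒙) and other operating systems; this is not limited in the embodiments of this application. The software structure of the electronic device 100 is described below using a PC with the Windows system as an example.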
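FIG. 4 is a block diagram of the software structure of an example electronic device 100 provided in an embodiment of this application.
As shown in FIG. 4, the Windows system is broadly divided into a kernel mode and a user mode. The kernel mode and the user mode can run at different privilege levels of the central processing unit (CPU); for example, the kernel mode can run at ring 0 of the CPU and the user mode at ring 3.
Each layer of the Windows system consists of several components. As a whole, the operation of the Windows system relies on upper-layer components calling lower-layer components. Each layer exposes fixed interfaces for the layer above, and an upper layer that wants to perform a privileged operation must send a request to the lower layer.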
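As shown in FIG. 4, in user mode the application layer may include a music application, a video application, a conference-call application, a game application, and so on, as well as multi-screen interaction applications and smart voice applications that provide functions such as translation and speech-to-text. This is not limited in the embodiments of this application.
Optionally, the application layer can call the application programming interfaces (API) of its corresponding subsystem (for example Win32, POSIX or OS/2). Through the API, the application layer can call system service functions and in turn the corresponding services. Applications remain relatively isolated from one another, and communication between them has to go through the system; the operating system provides some basic inter-process communication mechanisms to support interaction between applications in the application layer.
By way of example, the Windows subsystem can convert API functions into Native API calls to provide compatibility for applications. Function calls in the Native API are converted into system service function calls, enter kernel mode, and are passed further down to implement the corresponding function.
It should be understood that the kernel mode of the Windows system implements the basic mechanisms of the operating system. Everything running in kernel mode is core code, which is protected against malicious attack. Applications running in user mode, by contrast, are the least secure and the most exposed to attack, so their permissions are restricted. If an application needs to perform an action such as directly accessing physical memory, it must send a request to the relevant executive component in kernel mode.
The kernel mode of the Windows system contains the basic operating system primitives and functions, such as drivers and executive components. By way of example, kernel mode can provide functions and semantics that can be called directly by applications in the application layer or by kernel drivers, such as the input/output (I/O) manager, the object manager, the process manager, the virtual memory manager, the configuration manager and other components. Different managers manage different objects; for example, the object manager manages the objects in the executive. Details are not repeated here.
In the kernel mode of the Windows system, the layer that deals directly with the hardware is called the hardware abstraction layer (HAL). The HAL isolates all hardware-related code logic in a dedicated module so that the layers above can be as independent of the hardware platform as possible.
Hardware devices may include microphones, speakers, mice, keyboards, displays, disks, printers, network devices, and so on. Through their respective drivers, different hardware devices are handled by the operating system in a uniform way. By way of example, the system kernel accepts an application's request and communicates with the hardware device; conversely, when a hardware device sends a signal to the computer, the driver receives the signal and, together with the system kernel, passes it to the corresponding application.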
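In the embodiments of this application, taking the music application in the scenario of FIG. 1 as an example, when the user clicks the icon of the music application to enter its running interface, the music application can call system service functions through the API and in turn the relevant drivers to create playback resources for the music application. When the user then clicks the play button to start playing music and sets the audio to be output through the headset, the audio data is passed through the interface to the audio driver, which starts playing the audio through a hardware device such as the headset using the created playback resources. The subsequent playback process is not described further.
For ease of understanding, the following embodiments take a PC with the structures shown in FIG. 3 and FIG. 4 as an example and describe the audio splitting method provided in the embodiments of this application with reference to the drawings and application scenarios.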
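FIG. 5 is a schematic diagram of an example audio splitting process provided in an embodiment of this application.
It should be understood that the process in FIG. 5 may be based on the software structure of the Windows system shown in FIG. 4. In the embodiments of this application an audio service platform additionally needs to be built. The platform can be implemented with an audio-service software development kit, which may, for example, provide an audio service module and an audio processing module.
In a possible implementation, the audio service module can be implemented by a vendor-provided audio service program pre-installed by the user (for example, Huawei audio service provided by Huawei), and the audio processing module can be implemented by a vendor-provided audio software development kit pre-installed by the user (for example, the Huawei audio software development kit, Huawei audio SDK, provided by Huawei).
The audio service module provided by the Huawei audio service program can collect process-related information when a playback application starts and send it to the audio processing object (APO) module. The APO module can be understood as a user-mode audio driver; it can collect the audio data delivered by each playback application and allocate playback resources to each application.
The Huawei audio SDK can be understood as an interface for obtaining the audio data of playback applications, also called an "audio kit". The interface gives applications in the application layer (for example, smart voice applications and multi-screen interaction applications) the ability to obtain, in real time, the audio data of a specified process. It can be understood that Huawei audio service and Huawei audio SDK are only names; other names may be used for modules that implement the same functions, and this is not limited in this application.
Optionally, the Huawei audio service program and the Huawei audio SDK are audio service software officially released by Huawei. Users can download them from the Huawei Developer Alliance website and set up the corresponding audio service environment. Details are not repeated in the embodiments of this application.
It should also be understood that the modules in the area above the dashed line in FIG. 5 correspond to the user-mode software modules in FIG. 4, and that the audio processing object (APO) module below the dashed line can be understood as a user-mode driver. Details are not repeated here.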
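As shown in FIG. 5, taking a user who plays the audio of multiple applications at the same time (for example M applications, M≥2) as an example, the following describes how the audio data of each application is separated from the audio of the multiple applications. The method 500 may include the following steps:
501: The user opens one or more playback applications, and the playback applications call WASAPI.
It should be understood that opening one or more playback applications in step 501 can mean that the user clicks the icon of each application and enters its running interface. At this point the user may not yet have clicked the play button, so the application's audio has not started playing; alternatively, the user may open the application through a shortcut that starts playback at the same time. This is not limited in the embodiments of this application.
Optionally, the playback applications may include any one or more of a conference call, a video application, a music application, a game application, and so on. Taking the music application as an example, in step 501 the user can click the icon of the music application to enter its running interface; in response to the user's operation, the music application can call an interface of the Windows system, such as WASAPI.
502: The process corresponding to each application is obtained through WASAPI, that is, the pid of each application is determined.
Optionally, in step 502 WASAPI may assign the corresponding process to each application, or WASAPI may call several other system interfaces and functions to obtain the pid of each application. This is not limited in the embodiments of this application.
Table 1 shows an example of a possible correspondence between application names and process identifiers. When the user runs multiple applications, the system can determine the pid of each running application through WASAPI, as shown in Table 1. For example, the music application corresponds to pid 1, the video application to pid 2 and the conference-call application to pid 3; further examples are omitted.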
Table 1
Application name    Process identifier (pid)
Music               pid 1
Video               pid 2
Conference call     pid 3
……                  ……
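503: The audio service module obtains the pid of each application.
It should be understood that the audio service module can establish a connection with WASAPI of the Windows system and access the process information of each application through WASAPI.
Optionally, for a given application, step 503 may occur once the user has opened the application and WASAPI has obtained its pid, at which point WASAPI automatically sends the pid to the audio service module. Alternatively, the audio service module may request the pid only after the user has clicked to play the application's audio. This is not limited in the embodiments of this application.
Optionally, in step 503 the audio service module may also obtain further application-related information, such as attribute information (for example the application name) and status information of the application. This is not limited in the embodiments of this application. The status information can indicate the current playback state of the application, such as playing or stopped.
504: The audio service module sends information containing the pid of each application to the APO module. Correspondingly, the APO module receives the information and determines the pid of each application.
It should be understood that the timing of step 504 may be implemented in different ways.
In a possible implementation, if the audio service module can obtain the playback status of each application, then after obtaining it, the module sends the pid of an application to the APO module when the status indicates that the application is playing; for an application that is not playing, the pid need not be sent.
Alternatively, once the audio service module has obtained the pid of an application, it automatically sends that pid to the APO module. This is not limited in the embodiments of this application.
In another possible implementation, the APO module requests the pid information of each application from the audio service module, and the audio service module performs step 504 in response, sending the information containing the pid of each application to the APO module. The timing and manner in which the audio service module sends the pid information to the APO module are not limited in the embodiments of this application.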
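The first two timing options for step 504 can be sketched as follows. This is only an illustration under stated assumptions: the `AppInfo` record, the `pids_to_forward` helper and the `only_playing` flag are hypothetical names introduced here and are not part of the application or of any real audio API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppInfo:
    """Hypothetical record the audio service keeps for one application."""
    name: str
    pid: int
    playing: bool


def pids_to_forward(apps, only_playing=True):
    """Return the pids the audio service would send to the APO module.

    only_playing=True models the variant that forwards a pid only while
    the application is playing; False models forwarding every known pid.
    """
    return [app.pid for app in apps if app.playing or not only_playing]


apps = [
    AppInfo("music", 1, playing=True),
    AppInfo("video", 2, playing=False),
    AppInfo("conference call", 3, playing=True),
]

playing_only = pids_to_forward(apps)
all_known = pids_to_forward(apps, only_playing=False)
```

Forwarding only the pids of playing applications keeps the mapping on the APO side small, at the cost of an extra notification when an application starts playing.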
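505: The APO module creates a playback resource for each playing application and establishes a one-to-one correspondence between the pid of each application and its playback resource.
FIG. 6 is a schematic diagram of an example audio stream processing procedure provided in an embodiment of this application. As shown in FIG. 6, on the path from the application layer to the APO module, the APO module can create a playback resource for each application as soon as that application starts running.
As shown in FIG. 6, the APO module can be divided into three layers, from top to bottom: the stream effects (SFX) layer, the mode effects (MFX) layer and the endpoint effects (EFX) layer. When an application starts playing audio, the SFX layer allocates one unique SFX object to that application, and the SFX object receives the audio data delivered by the application. From the MFX layer downward, the outputs of the SFX objects of all applications are mixed together to form mixed audio. The way the MFX layer and the EFX layer form the mixed audio data is not described in the embodiments of this application.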
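The layering described above can be pictured with a small model. It is a sketch under assumptions, not the actual APO interface: `SfxObject`, `mfx_mix` and `render` are hypothetical names, each frame is a short list of numbers, and mixing is modelled as a per-sample sum.

```python
class SfxObject:
    """Per-application stream stage that keeps a copy of what passes through."""

    def __init__(self):
        self.captured = []

    def process(self, frame):
        # Record the application's own frame before it is mixed.
        self.captured.append(list(frame))
        return frame


def mfx_mix(frames):
    """Mode-level stage: sum the per-application frames sample by sample."""
    return [sum(samples) for samples in zip(*frames)]


def render(frames_by_app, sfx_by_app):
    """Run one frame per application through its SFX object, then mix."""
    processed = [
        sfx_by_app[app].process(frame) for app, frame in frames_by_app.items()
    ]
    return mfx_mix(processed)


sfx_by_app = {"music": SfxObject(), "video": SfxObject()}
mixed = render({"music": [1, 1], "video": [2, 3]}, sfx_by_app)
```

In this model the mixed frame only exists below the SFX stage, while each `SfxObject` still holds the unmixed frame of its own application, which is the property the method relies on.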
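It should be understood that, in the embodiments of this application, the statement in step 503 that the audio service module can establish a connection with WASAPI of the Windows system means that the audio service module establishes a connection with the SFX layer. In other words, the audio service module can access each SFX object of the SFX layer through WASAPI and thereby determine the process information of each application.
By way of example, when the user clicks the music application and starts playing music, the APO module can create a playback resource for the music application; this playback resource can be understood as the SFX object of the music application. Combining this with the pid information of each application received in step 504, the APO module establishes a one-to-one correspondence between pids and SFX objects.
Table 2 shows an example of a possible correspondence between applications and playback resources. As shown in Table 2, the APO module can allocate an SFX object to each running application. For example, the music application corresponds to SFX 1, the video application to SFX 2 and the conference-call application to SFX 3.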
Table 2
Application name    Process identifier (pid)    SFX
Music               pid 1                       SFX 1
Video               pid 2                       SFX 2
Conference call     pid 3                       SFX 3
……                  ……                          ……
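The one-to-one correspondence in Table 2 can be modelled with a pair of dictionaries. The class below is a hypothetical sketch (`SfxRegistry`, `bind`, `sfx_for` and `unbind` are names invented for illustration); it only shows how uniqueness could be enforced in both directions.

```python
class SfxRegistry:
    """Keeps a one-to-one mapping between pids and SFX identifiers."""

    def __init__(self):
        self._sfx_by_pid = {}
        self._pid_by_sfx = {}

    def bind(self, pid, sfx_id):
        # Reject any binding that would break the one-to-one property.
        if pid in self._sfx_by_pid:
            raise ValueError(f"pid {pid} is already bound")
        if sfx_id in self._pid_by_sfx:
            raise ValueError(f"{sfx_id} is already bound")
        self._sfx_by_pid[pid] = sfx_id
        self._pid_by_sfx[sfx_id] = pid

    def sfx_for(self, pid):
        return self._sfx_by_pid[pid]

    def unbind(self, pid):
        sfx_id = self._sfx_by_pid.pop(pid)
        del self._pid_by_sfx[sfx_id]


registry = SfxRegistry()
for pid, sfx_id in [(1, "SFX 1"), (2, "SFX 2"), (3, "SFX 3")]:
    registry.bind(pid, sfx_id)
```

Keeping the reverse dictionary makes the uniqueness check on the SFX side a constant-time lookup.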
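506: The APO module sends the information on the one-to-one correspondence between pids and SFX objects to the audio processing module.
Optionally, the APO module can send this correspondence information to the audio processing module (for example, the Huawei audio SDK) in several possible ways. For example, the APO module may send it periodically, or it may send it in response to a request received from the audio processing module. The timing and manner in which the APO module sends the correspondence information to the audio processing module are not limited in the embodiments of this application.
It should be understood that the audio processing module (for example, the Huawei audio SDK provided by Huawei), as the interface through which smart voice applications, multi-screen interaction applications and the like access audio data, can store the one-to-one correspondence between pids and SFX objects. The interface can also accept request messages from such applications and obtain the audio data of a specified process for them.
507: The audio processing module receives a request message from a smart voice application or a multi-screen interaction application. The request message includes the pid of a first application and requests the audio data of the first application.
It should be understood that the first application here may be one or more applications; in other words, the audio data of several first applications may be obtained simultaneously. This is not limited in the embodiments of this application.
Optionally, when the user uses the speech-to-text function of a smart voice application to transcribe the current conference call, the smart voice application can send the request message to the audio processing module (for example, the Huawei audio SDK provided by Huawei) to request the audio data of the conference call. In this scenario, the first application may be the conference-call application.
Alternatively, when the user uses translation software to obtain the audio of the video being played, the translation software can send the request message to the Huawei audio SDK to request the audio data of that video. In this scenario, the first application may be the video application.
Alternatively, when a multi-screen interaction application is used to switch the video on the PC to another electronic device such as a tablet or a smart screen, the multi-screen interaction application can send the request message to the audio processing module (for example, the Huawei audio SDK provided by Huawei) to request the audio data of the video being played. In this scenario, the first application may be the video application. In addition, when the video being played is switched to another electronic device for display, the image frames of the video also need to be obtained; the process and manner of obtaining the image frames are not limited in the embodiments of this application.
Alternatively, the first application may be multiple applications, that is, the audio data of the conference call and of the video application are obtained at the same time. By way of example, suppose the user obtains the audio of the current conference call through the speech-to-text function of a smart voice application and, at the same time, obtains the audio of the video being played through translation software. When the audio processing module receives the request messages from the smart voice application and the translation software, it can determine SFX 3 from the pid of the conference call and obtain the audio data of SFX 3, and determine SFX 2 from the pid of the video application and obtain the audio data of SFX 2, and then perform the corresponding processing. The audio data of several different applications is thus obtained simultaneously, which achieves the splitting of the audio data.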
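508: The audio processing module determines the SFX object of the first application from its pid and obtains the audio data from that SFX object.
It should be understood that the audio processing module (for example, the Huawei audio SDK provided by Huawei) obtains the pids of the one or more applications included in the request message, determines the SFX object corresponding to each pid, and obtains the audio data of that application directly from the SFX object.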
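Steps 507 and 508 amount to a lookup by pid followed by a read from the matching SFX object. The sketch below illustrates that flow under assumptions: `AudioKit` and `get_audio` are hypothetical names and do not describe the real Huawei audio SDK interface, and the captured frames are plain lists standing in for audio buffers.

```python
class AudioKit:
    """Hypothetical facade that serves per-application audio by pid."""

    def __init__(self, sfx_by_pid, frames_by_sfx):
        self._sfx_by_pid = dict(sfx_by_pid)
        self._frames_by_sfx = frames_by_sfx

    def get_audio(self, pids):
        """Return {pid: frames} for every requested pid."""
        result = {}
        for pid in pids:
            sfx_id = self._sfx_by_pid.get(pid)
            if sfx_id is None:
                raise KeyError(f"no SFX object is mapped to pid {pid}")
            # Copy so the caller cannot modify the stored frames.
            result[pid] = list(self._frames_by_sfx.get(sfx_id, []))
        return result


kit = AudioKit(
    sfx_by_pid={1: "SFX 1", 2: "SFX 2", 3: "SFX 3"},
    frames_by_sfx={"SFX 2": [[0.1, 0.2]], "SFX 3": [[0.5], [0.6]]},
)

# One request may name several first applications at once.
audio = kit.get_audio([3, 2])
```

A request that names both the conference call (pid 3) and the video application (pid 2) gets both streams back separately, which corresponds to the multi-application case of step 507.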
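It should also be understood that, in the native Windows system, the audio data stream of the APO module is a mix of the audio data of multiple applications, and the APO module cannot tell which application each piece of audio data belongs to.
509: The audio processing module returns the obtained audio data of the first application to the smart voice application or the multi-screen interaction application.
In one possible scenario, the user in step 507 uses the speech-to-text function of a smart voice application to transcribe the current conference call; in step 509 the audio processing module then returns the obtained audio data of the conference-call application to the smart voice application so that it can convert the audio data into text.
Alternatively, the user in step 507 uses the translation service of translation software to convert the English audio of the video being played into Chinese subtitles; in step 509 the audio processing module returns the obtained audio data of the video application to the translation software so that it can convert the current English audio into Chinese subtitles in real time and display them on the screen.
Alternatively, the user in step 507 uses a multi-screen interaction application to switch the video on the PC to another electronic device such as a tablet or a smart screen; in step 509 the audio processing module returns the obtained audio data of the video application, together with the image display data, to the multi-screen interaction application so that both the audio and the picture of the PC video can be switched to the other electronic device. Further examples are omitted.
With the above method, when multiple applications play audio through one audio device at the same time, the audio service module obtains the process information of each application before the APO module mixes the audio data of the applications, and sends it to the APO module. The APO module can then accurately match each application's process information to the corresponding SFX object, that is, assign a unique SFX object to each application, so that the audio data of each individual application can be obtained accurately. The process does not require the user to switch audio drivers and audio devices through cumbersome operations, which simplifies the workflow.
In addition, the method can split out the audio data of several applications simultaneously from the audio data of multiple applications. When several applications are running and the audio data of each is needed at the same time, the method gives the user a distributed audio service capability and completes the audio splitting quickly and accurately. Furthermore, the process needs no additional virtual audio driver or virtual audio device and does not interfere with special sound effects, which improves the user experience.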
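FIG. 7 is a sequence diagram of an example audio stream processing procedure provided in an embodiment of this application. As shown in FIG. 7, the procedure 700 comprises three stages: before playback, start of playback and stop of playback. The processes in each stage are described below.
Optionally, in the procedure 700 shown in FIG. 7, the audio service module can be implemented by a vendor-provided audio service program pre-installed by the user (for example, Huawei audio service provided by Huawei), and the audio processing module can be implemented by a vendor-provided audio software development kit pre-installed by the user (for example, the Huawei audio SDK provided by Huawei). This is not repeated below.
Stage 1: Before playback
701: The user opens a playback application, which calls an interface of the Windows system; a corresponding process is assigned to each application, that is, the pid of each application is determined.
Optionally, the playback applications may include any one or more of a conference call, a video application, a music application, a game application, and so on. Taking the music application as an example, in stage 1 the user can click the icon of the music application to enter its running interface; in response to the user's operation, the music application can call an interface of the Windows system, such as WASAPI.
By way of example, step 701 may follow the process of step 502 in the method 500 described above: the interface of the Windows system can determine, for each application, a pid as listed in Table 1. Details are not repeated here.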
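702: The Windows system establishes a connection with the audio service module. In other words, when the user runs a playback application, the Windows system can be triggered to connect, through WASAPI, to the audio service module provided in the embodiments of this application, so that related information can be exchanged in the subsequent process.
703: The audio service module creates a control pipe and a data pipe for the APO.
704: The Windows system triggers the APO, through WASAPI, to create a playback resource. Optionally, creating a playback resource here can be understood as creating a new SFX object.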
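Stage 2: Start of playback
705: Playback starts.
By way of example, taking the music application, in stage 2 the user can click the play button of the music application to start playing music.
706: The Windows system notifies the audio service module through WASAPI that playback has started.
It should be understood that "playback has started" here can be understood as the status information of the current application, namely the playing state.
707: The audio service module obtains the process identifier. Optionally, the audio service module may also obtain further application-related information, such as attribute information (for example the application name). This is not limited in the embodiments of this application.
708: The audio service module sends the process identifier information to the APO. Optionally, the audio service module can send the information containing the process identifier to the APO through the control pipe created in step 703; for example, it can send the process identifiers and application names listed in Table 1. It should be understood that the number of process identifiers or applications is not limited in the embodiments of this application.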
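The application does not specify the format of the information sent over the control pipe in step 708. As one possible illustration, the sketch below assumes a newline-terminated JSON message; the message layout, the `pid_info` type tag and both function names are assumptions made here, not details from the source.

```python
import json


def encode_pid_message(apps):
    """Serialize (name, pid) pairs into one newline-terminated JSON message."""
    payload = {
        "type": "pid_info",
        "apps": [{"name": name, "pid": pid} for name, pid in apps],
    }
    return (json.dumps(payload, sort_keys=True) + "\n").encode("utf-8")


def decode_pid_message(raw):
    """Recover the pid -> name mapping on the receiving side."""
    payload = json.loads(raw.decode("utf-8"))
    if payload.get("type") != "pid_info":
        raise ValueError("unexpected message type")
    return {app["pid"]: app["name"] for app in payload["apps"]}


raw = encode_pid_message([("music", 1), ("video", 2), ("conference call", 3)])
names_by_pid = decode_pid_message(raw)
```

A text format keeps the example easy to inspect; a real control channel between a service and a driver-side component would more likely use a compact binary layout.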
709: The music application starts sending audio data to the underlying Windows system.
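710: The Windows system instantiates the SFX object through the APO. Optionally, instantiating the SFX object can be understood as assigning the SFX object newly created in step 704 to the corresponding application.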
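711: The APO determines the correspondence between process identifiers and SFX objects.
Specifically, in steps 707 and 708 the audio service module sends each application and its process identifier to the APO, and the APO can pair each process with one SFX object. Accordingly, for multiple processes corresponding to multiple applications, the APO can determine a one-to-one correspondence between processes and SFX objects.
By way of example, step 711 may follow the process of step 505 in the method 500 described above: the APO module can determine the correspondence listed in Table 2 from the pid and the SFX object of each application. Details are not repeated here.
712: The APO sends the information on the correspondence between process identifiers and SFX objects to the audio processing module.
Through the above steps, the audio processing module (for example, the Huawei audio SDK provided by Huawei) has obtained the correspondence between process identifiers and SFX objects. When the user invokes a multi-screen interaction application or a smart voice application, that application can send a request to the audio processing module for the audio data of a target application. The audio processing module can determine the SFX object of the target application from its pid and then obtain the audio data from that SFX object, which is the audio data of the target application.
It should be understood that the latency between step 706 and step 709 has been measured at less than 1 millisecond, which ensures that audio data is obtained in real time.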
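713: A smart voice application, a multi-screen interaction application or the like requests the audio data of a first application from the audio processing module. It should be understood that the request message includes the pid of the first application.
714: The audio processing module determines the pid of the first application and, from that pid, the SFX object of the first application.
715: The audio processing module obtains the audio data from the SFX object of the first application.
716: The audio processing module returns the audio data of the first application to the smart voice application, multi-screen interaction application or the like.
In one possible scenario, the user in step 713 uses the speech-to-text function of a smart voice application to transcribe the current conference call; in step 716 the audio processing module then returns the obtained audio data of the conference-call application to the smart voice application so that it can convert the audio data into text.
Alternatively, the user in step 713 uses the translation service of translation software to convert the English audio of the video being played into Chinese subtitles; in step 716 the audio processing module returns the obtained audio data of the video application to the translation software so that it can convert the current English audio into Chinese subtitles in real time and display them on the screen.
Alternatively, the user in step 713 uses a multi-screen interaction application to switch the video on the PC to another electronic device such as a tablet or a smart screen; in step 716 the audio processing module returns the obtained audio data of the video application, together with the image display data, to the multi-screen interaction application so that both the audio and the picture of the PC video can be switched to the other electronic device. Further examples are omitted.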
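Stage 3: Stop of playback
717-718: After the user stops playback, the application can send a stop-playback instruction to the underlying Windows system, and the Windows system can call the relevant interface to perform an interrupt operation, that is, interrupt the playback resource.
719-720-721: After the user closes the application, the application can send an instruction to terminate the audio service to the underlying Windows system, and the Windows system can call the relevant interface to perform a termination operation and delete the previously established communication channels, such as the control pipe and the data pipe.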
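The three stages of the procedure 700 can be read as a small life cycle for each playback resource. The state machine below is a hypothetical sketch of that reading: the state names, the `PlaybackResource` class and its methods are introduced here for illustration and do not appear in the application.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"      # stage 1: resource created, nothing playing
    PLAYING = "playing"      # stage 2: receiving audio data
    STOPPED = "stopped"      # stage 3: playback interrupted
    RELEASED = "released"    # stage 3: application closed, resource freed


class PlaybackResource:
    def __init__(self, pid):
        self.pid = pid
        self.state = State.CREATED
        self.frames = []

    def start(self):
        self._require(State.CREATED, State.STOPPED)
        self.state = State.PLAYING

    def write(self, frame):
        """Accept a frame only while playing; report whether it was kept."""
        if self.state is not State.PLAYING:
            return False
        self.frames.append(frame)
        return True

    def stop(self):
        self._require(State.PLAYING)
        self.state = State.STOPPED

    def release(self):
        self.frames.clear()
        self.state = State.RELEASED

    def _require(self, *allowed):
        if self.state not in allowed:
            raise RuntimeError(f"invalid transition from {self.state.value}")


resource = PlaybackResource(pid=1)
resource.start()
accepted_while_playing = resource.write([0.1, 0.2])
resource.stop()
accepted_after_stop = resource.write([0.3, 0.4])
resource.release()
```

Stopping only pauses reception, so playback can resume on the same resource, whereas closing the application releases the resource altogether.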
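It should be understood that the sequence in FIG. 7 is only an example. A specific implementation may include some or all of the steps described above, and the order of some steps may be adjusted. This is not limited in the embodiments of this application.
With the above method, when the audio of multiple applications is played through one and the same audio device, a session is established between the driver layer and user mode, and multiple audio data paths are handled in parallel and paired precisely in time. The audio data of several target applications is locked and intercepted before the audio of the applications is mixed, so that the audio data of several target applications can be split out simultaneously. The process does not require the user to switch audio drivers and audio devices through cumbersome operations, which simplifies the workflow.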
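It can be understood that, to implement the above functions, the electronic device includes corresponding hardware and/or software modules for performing each function. In combination with the algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application in combination with the embodiments, but such implementations shall not be considered to go beyond the scope of this application.
In this embodiment, the electronic device may be divided into functional modules according to the above method examples. For example, each functional module may be divided to correspond to each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in hardware. It should be noted that the division of modules in this embodiment is schematic and is only a division by logical function; other divisions are possible in an actual implementation.
Where each functional module is divided to correspond to each function, the electronic device may include a display unit, a detection unit, a processing unit, and the like. The display unit, the detection unit and the processing unit cooperate with one another and can be used to support the electronic device in performing the steps described above and/or other processes of the technology described herein.
It should be noted that all relevant content of the steps in the above method embodiments can be cited in the functional description of the corresponding functional module and is not repeated here.
The electronic device provided in this embodiment is used to perform the above audio splitting method and can therefore achieve the same effects as the above implementation method.
Where an integrated unit is used, the electronic device may include a processing module, a storage module and a communication module. The processing module can be used to control and manage the actions of the electronic device; for example, it can support the electronic device in performing the steps performed by the display unit, the detection unit and the processing unit. The storage module can be used to support the electronic device in storing program code, data, and the like. The communication module can be used to support communication between the electronic device and other devices.
The processing module may be a processor or a controller. It can implement or execute the various exemplary logical blocks, modules and circuits described in connection with the disclosure of this application. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a digital signal processor (DSP) and a microprocessor. The storage module may be a memory. The communication module may specifically be a radio frequency circuit, a Bluetooth chip, a Wi-Fi chip or another device that interacts with other electronic devices.
In one embodiment, when the processing module is a processor and the storage module is a memory, the electronic device in this embodiment may be a device having the structure shown in FIG. 3.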
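This embodiment also provides a computer-readable storage medium storing computer instructions. When the computer instructions run on an electronic device, the electronic device performs the above related method steps to implement the audio splitting method in the above embodiments.
This embodiment also provides a computer program product. When the computer program product runs on a computer, the computer performs the above related steps to implement the audio splitting method in the above embodiments.
In addition, the embodiments of this application also provide an apparatus, which may specifically be a chip, a component or a module. The apparatus may include a processor and a memory that are connected to each other. The memory stores computer-executable instructions, and when the apparatus runs, the processor can execute the computer-executable instructions stored in the memory so that the chip performs the audio splitting method in the above method embodiments.
The electronic device, computer-readable storage medium, computer program product and chip provided in this embodiment are all used to perform the corresponding method provided above. For their beneficial effects, refer to the beneficial effects of the corresponding method provided above; they are not repeated here.
From the description of the above implementations, a person skilled in the art can understand that, for convenience and brevity of description, only the division of the above functional modules is used as an example. In practical applications, the above functions can be allocated to different functional modules as required, that is, the internal structure of the apparatus is divided into different functional modules to complete all or some of the functions described above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are only schematic. The division of modules or units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another apparatus, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or in another form.
Units described as separate components may or may not be physically separate, and components shown as units may be one physical unit or multiple physical units, that is, they may be located in one place or distributed over several different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip or the like) or a processor to perform all or some of the steps of the methods in the embodiments of this application. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The above is only a specific implementation of this application, but the protection scope of this application is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.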

Claims (15)

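  1. An audio splitting method, characterized in that it is applied to an electronic device comprising one or more audio devices, the method comprising:
    receiving a first operation of a user and, in response to the first operation, playing, by the electronic device, the audio corresponding to M applications through a first audio device, wherein the first audio device is any one of the one or more audio devices, each of the M applications is an application capable of outputting corresponding audio, and M is an integer greater than or equal to 2;
    receiving a second operation of the user, the second operation being used to request the audio data of a first application, the first application being any one of the M applications;
    in response to the second operation, determining the playback resource of the first application from the playback resources associated with the M applications;
    obtaining the audio data of the first application from the playback resource of the first application.
  2. The method according to claim 1, characterized in that the method further comprises:
    determining a mapping relationship between M process identifiers corresponding to the M applications and the playback resources corresponding to the M applications, wherein each of the M applications corresponds to one process identifier and each of the M applications corresponds to one playback resource.
  3. The method according to claim 2, characterized in that determining the playback resource of the first application from the playback resources corresponding to the M applications comprises:
    determining the process identifier of the first application;
    determining the playback resource of the first application from the M playback resources according to the process identifier of the first application and the mapping relationship.
  4. The method according to any one of claims 1 to 3, characterized in that the method further comprises:
    running, by the electronic device, the M applications and allocating M playback resources to the M applications;
    obtaining the M process identifiers corresponding to the M applications.
  5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
    receiving a third operation of the user, the third operation being used to request that playback of the audio of the first application be interrupted;
    in response to the third operation, suspending, by the playback resource of the first application, reception of the audio data of the first application.
  6. The method according to any one of claims 1 to 5, characterized in that the method further comprises:
    receiving a fourth operation of the user, the fourth operation being used to close the first application;
    in response to the fourth operation, suspending, by the playback resource of the first application, reception of the audio data of the first application and releasing the playback resource of the first application.
  7. The method according to any one of claims 1 to 6, characterized in that the M applications include any of a music application, a video application, a game application and a conference application.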
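  8. An electronic device, characterized in that it comprises:
    one or more audio devices;
    one or more processors;
    one or more memories;
    a module on which multiple applications are installed;
    the memory storing one or more programs which, when executed by the processor, cause the electronic device to perform the following steps:
    receiving a first operation of a user and, in response to the first operation, playing the audio corresponding to M applications through a first audio device, wherein the first audio device is any one of the one or more audio devices, each of the M applications is an application capable of outputting corresponding audio, and M is an integer greater than or equal to 2;
    receiving a second operation of the user, the second operation being used to request the audio data of a first application, the first application being any one of the M applications;
    in response to the second operation, determining the playback resource of the first application from the playback resources associated with the M applications;
    obtaining the audio data of the first application from the playback resource of the first application.
  9. The electronic device according to claim 8, characterized in that the one or more programs, when executed by the processor, cause the electronic device to perform the following step:
    determining a mapping relationship between M process identifiers corresponding to the M applications and the playback resources corresponding to the M applications, wherein each of the M applications corresponds to one process identifier and each of the M applications corresponds to one playback resource.
  10. The electronic device according to claim 9, characterized in that the one or more programs, when executed by the processor, cause the electronic device to perform the following steps:
    determining the process identifier of the first application;
    determining the playback resource of the first application from the M playback resources according to the process identifier of the first application and the mapping relationship.
  11. The electronic device according to any one of claims 8 to 10, characterized in that the one or more programs, when executed by the processor, cause the electronic device to perform the following steps:
    running the M applications and allocating M playback resources to the M applications;
    obtaining the M process identifiers corresponding to the M applications.
  12. The electronic device according to any one of claims 8 to 11, characterized in that the one or more programs, when executed by the processor, cause the electronic device to perform the following steps:
    receiving a third operation of the user, the third operation being used to request that playback of the audio of the first application be interrupted;
    in response to the third operation, suspending, by the playback resource of the first application, reception of the audio data of the first application.
  13. The electronic device according to any one of claims 8 to 12, characterized in that the one or more programs, when executed by the processor, cause the electronic device to perform the following steps:
    receiving a fourth operation of the user, the fourth operation being used to close the first application;
    in response to the fourth operation, suspending, by the playback resource of the first application, reception of the audio data of the first application and releasing the playback resource of the first application.
  14. The electronic device according to any one of claims 8 to 13, characterized in that the M applications include any of a music application, a video application, a game application and a conference application.
  15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions which, when run on an electronic device, cause the electronic device to perform the method according to any one of claims 1 to 7.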
PCT/CN2022/084069 2021-05-28 2022-03-30 一种音频分流的方法及电子设备 WO2022247455A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110598410.0A CN115407962A (zh) 2021-05-28 2021-05-28 一种音频分流的方法及电子设备
CN202110598410.0 2021-05-28

Publications (1)

Publication Number Publication Date
WO2022247455A1 true WO2022247455A1 (zh) 2022-12-01

Family

ID=84156363

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/084069 WO2022247455A1 (zh) 2021-05-28 2022-03-30 一种音频分流的方法及电子设备

Country Status (2)

Country Link
CN (1) CN115407962A (zh)
WO (1) WO2022247455A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117082403A (zh) * 2023-08-18 2023-11-17 荣耀终端有限公司 多路音频处理方法、及电子设备
CN117707464A (zh) * 2023-07-21 2024-03-15 荣耀终端有限公司 音频处理方法及相关设备

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9411809B1 (en) * 2014-03-07 2016-08-09 Amazon Technologies, Inc. Remote content presentation queues
CN110381215A (zh) * 2019-07-30 2019-10-25 深圳市沃特沃德股份有限公司 音频分流方法、装置、存储介质及计算机设备
CN110868621A (zh) * 2018-08-27 2020-03-06 中兴通讯股份有限公司 一种音频播放方法、装置、设备及计算机可读介质
CN113890932A (zh) * 2020-07-02 2022-01-04 华为技术有限公司 一种音频控制方法、系统及电子设备

Also Published As

Publication number Publication date
CN115407962A (zh) 2022-11-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22810188

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22810188

Country of ref document: EP

Kind code of ref document: A1