CN114416011B - Terminal, audio control method and storage medium


Info

Publication number: CN114416011B
Authority: CN (China)
Prior art keywords: audio, stream, output, input, audio data
Legal status: Active (granted)
Application number: CN202111342130.XA
Other languages: Chinese (zh)
Other versions: CN114416011A, CN114416011B8
Inventors: 耿炳钰, 李秀勇
Assignee: Hisense Mobile Communications Technology Co Ltd
Application filed by Hisense Mobile Communications Technology Co Ltd

Classifications

    • G06F 3/16: Sound input; sound output (G: Physics; G06: Computing; G06F: Electric digital data processing; G06F 3/00: Input/output arrangements)
    • G06F 3/162: Interface to dedicated audio devices, e.g. audio drivers, interface to CODECs
    • G06F 3/165: Management of the audio stream, e.g. setting of volume, audio stream path
    • G06F 3/167: Audio in a user interface, e.g. using voice commands for navigating, audio feedback


Abstract

The application discloses a terminal, an audio control method and a storage medium, belonging to the technical field of audio processing. The method comprises: acquiring audio data to be output of a first audio object; determining, based on a stored conversion relation between audio data to be output of an audio object and audio data to be input of an audio object, whether there is a second audio object that takes the audio data to be output of the first audio object as input; if so, converting the audio data to be output of the first audio object into audio data to be input of the second audio object; sending the converted audio data to be input to the second audio object for processing; and then sending the audio data processed by the second audio object outwards. In this way, the transmission direction of the audio data to be output of the first audio object can be changed so that it becomes one path of audio data to be input of the second audio object, linking the audio data of different audio objects and improving the flexibility of audio data transmission.

Description

Terminal, audio control method and storage medium
Technical Field
The present disclosure relates to the field of audio processing technologies, and in particular, to a terminal, an audio control method, and a storage medium.
Background
In the related art, audio objects on a terminal such as a cellular phone and its applications generally generate audio data such as phone alert tones, message notification tones, call sounds, and video sounds provided by applications. The audio data of different audio objects are independent of each other, and the terminal does not manage the audio data across different audio objects in any unified way.
Disclosure of Invention
The embodiment of the application provides a terminal, an audio control method and a storage medium, which are used for providing a management scheme of sound in the terminal.
In a first aspect, an embodiment of the present application provides a terminal, including:
a processor configured to acquire audio data to be output of a first audio object;
determining whether a second audio object taking the audio data to be output of the first audio object as input exists or not based on a conversion relation between the stored audio data to be output of the audio object and the audio data to be input of the audio object;
if the second audio object is determined to exist, converting the audio data to be output of the first audio object into the audio data to be input of the second audio object;
transmitting the audio data to be input obtained through conversion to the second audio object for processing;
and a sending component configured to send the audio data processed by the second audio object outwards.
In some embodiments, the processor is specifically configured to:
determining whether a second audio object taking the audio data to be output of the first audio object as input exists or not based on a temporary conversion relation between the stored audio data to be output of the audio object and the audio data to be input of the audio object;
if no such second audio object is found, determining whether a second audio object taking the audio data to be output of the first audio object as input exists based on a default conversion relation between the stored audio data to be output of the audio object and the audio data to be input of the audio object.
In some embodiments, the processor is further configured to:
if it is determined that the second audio object currently has audio data to be input, mixing the current audio data to be input with the audio data to be input obtained through conversion to obtain mixed input audio data;
and sending the mixed input audio data to the second audio object for processing.
In some embodiments, the processor is further configured to:
acquiring audio data to be output of the second audio object;
mixing the audio data to be output of the first audio object and the audio data to be output of the second audio object to obtain mixed output audio data;
further comprises:
an audio playing part configured to output the mixed output audio data.
In some embodiments, the processor is further configured to:
and for any one of the audio data to be output of the first audio object, the audio data to be output of the second audio object, and the mixed output audio data, responding to a control operation on that audio data and performing corresponding control processing on it, wherein the control operation includes pausing output, starting output, and ending output.
In some embodiments, if there are at least two audio output modes, the processor is further configured to:
responding to the switching operation of the audio output mode, and determining the current audio output mode;
the audio playing component is specifically configured to output the mixed output audio data through a determined audio output mode.
In some embodiments, when the first audio object is an application, the second audio object is a different application than the application or the second audio object is a cellular telephone; when the first audio object is a cellular phone, the second audio object is an application.
In a second aspect, an embodiment of the present application provides an audio control method, including:
acquiring audio data to be output of a first audio object;
determining whether a second audio object taking the audio data to be output of the first audio object as input exists or not based on a conversion relation between the stored audio data to be output of the audio object and the audio data to be input of the audio object;
if the second audio object is determined to exist, converting the audio data to be output of the first audio object into the audio data to be input of the second audio object;
transmitting the audio data to be input obtained through conversion to the second audio object for processing;
and sending the audio data processed by the second audio object outwards.
In some embodiments, determining whether there is a second audio object having audio data to be output of the first audio object as input based on a conversion relationship between the saved audio data to be output of the audio object and the audio data to be input of the audio object, includes:
determining whether a second audio object taking the audio data to be output of the first audio object as input exists or not based on a temporary conversion relation between the stored audio data to be output of the audio object and the audio data to be input of the audio object;
if no such second audio object is found, determining whether a second audio object taking the audio data to be output of the first audio object as input exists based on a default conversion relation between the stored audio data to be output of the audio object and the audio data to be input of the audio object.
In some embodiments, further comprising:
if it is determined that the second audio object currently has audio data to be input, mixing the current audio data to be input with the audio data to be input obtained through conversion to obtain mixed input audio data;
and sending the mixed input audio data to the second audio object for processing.
In some embodiments, further comprising:
acquiring audio data to be output of the second audio object;
mixing the audio data to be output of the first audio object and the audio data to be output of the second audio object to obtain mixed output audio data;
outputting the mixed output audio data.
In some embodiments, further comprising:
and for any one of the audio data to be output of the first audio object, the audio data to be output of the second audio object, and the mixed output audio data, responding to a control operation on that audio data and performing corresponding control processing on it, wherein the control operation includes pausing output, starting output, and ending output.
In some embodiments, if there are at least two audio output modes, further comprising:
responding to the switching operation of the audio output mode, and determining the current audio output mode;
and outputting the mixed output audio data through the determined audio output mode.
In some embodiments, when the first audio object is an application, the second audio object is a different application than the application or the second audio object is a cellular telephone; when the first audio object is a cellular phone, the second audio object is an application.
In a third aspect, embodiments of the present application provide a storage medium, where instructions in the storage medium, when executed by a processor of a terminal, enable the terminal to perform any one of the above-described audio control methods.
In the embodiments of the application, the audio data to be output of the first audio object is obtained; whether a second audio object taking the audio data to be output of the first audio object as input exists is determined based on the stored conversion relation between the audio data to be output of an audio object and the audio data to be input of an audio object; if such a second audio object exists, the audio data to be output of the first audio object is converted into audio data to be input of the second audio object, the converted audio data to be input is sent to the second audio object for processing, and the audio data processed by the second audio object is then sent outwards. In this way, the transmission direction of the audio data to be output of the first audio object can be changed so that it becomes one path of audio data to be input of the second audio object, linking the audio data of different audio objects and improving the flexibility of audio data transmission.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
fig. 1 is a schematic structural diagram of a terminal according to an embodiment of the present application;
fig. 2 is a schematic software architecture diagram of a terminal according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of another terminal according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a path configuration interface according to an embodiment of the present disclosure;
fig. 5 is a schematic diagram of a call interface according to an embodiment of the present application;
fig. 6 is a schematic diagram of a processing procedure of audio data in a terminal according to an embodiment of the present application;
fig. 7 is a schematic diagram of a processing procedure of audio data in another terminal according to an embodiment of the present application;
fig. 8 is a flowchart of an audio control method according to an embodiment of the present application;
FIG. 9 is a flowchart of yet another audio control method according to an embodiment of the present disclosure;
fig. 10 is a schematic structural diagram of an audio control device according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. It is apparent that the described embodiments are only some embodiments of the technical solutions of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art, based on the embodiments described in the present application and without any inventive effort, are within the scope of the technical solutions of the present application.
The terminal in the embodiments of the application may be any of various electronic devices such as a mobile phone, an iPad or other tablet computer, a wearable device, or a vehicle-mounted unit, which is not limited in the embodiments of the application. Fig. 1 is a schematic structural diagram of a terminal 100 provided in an embodiment of the present application. It should be understood that the terminal 100 shown in fig. 1 is only an example; the terminal 100 may have more or fewer components than shown in fig. 1, may combine two or more components, or may have a different component configuration. The various components shown in the figures may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application-specific integrated circuits.
A hardware configuration block diagram of the terminal 100 according to an exemplary embodiment is shown in fig. 1. As shown in fig. 1, the terminal 100 includes: Radio Frequency (RF) circuitry 110, memory 120, display unit 130, camera 140, sensor 150, audio circuitry 160, audio playback component 170, wireless fidelity (Wireless Fidelity, Wi-Fi) module 180, processor 190, Bluetooth module 1100, and power supply 1200.
The RF circuit 110 may be configured to receive and transmit signals during information transmission and reception or during a call; it may receive downlink data from the base station and then deliver it to the processor 190 for processing, and may send uplink data to the base station. In general, the RF circuit 110 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier, a duplexer, and the like.
Memory 120 may be used to store software programs and data. Processor 190 performs the various functions of terminal 100 and processes data by executing software programs or data stored in memory 120. Memory 120 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. The memory 120 stores an operating system that enables the terminal 100 to operate. The memory 120 in the present application may store an operating system and various application programs, and may also store code for performing the methods described in the embodiments of the present application.
The display unit 130 may be used to receive input digital or character information and to generate signal inputs related to user settings and function control of the terminal 100. In particular, the display unit 130 may include a touch screen 1301 disposed at the front of the terminal 100, which may collect touch operations by the user on or near it, such as clicking buttons and dragging scroll boxes.
The display unit 130 may also be used to display information input by a user or information provided to the user and a graphical user interface (graphical user interface, GUI) of various menus of the terminal 100. In particular, the display unit 130 may include a display screen 1302 disposed on a front surface of the terminal 100. The display screen 1302 may be a color liquid crystal screen, and may be configured in the form of a liquid crystal display, a light emitting diode, or the like. The display unit 130 may be used to display various graphical user interfaces described in this application.
The touch screen 1301 may cover the display screen 1302, or the touch screen 1301 may be integrated with the display screen 1302 to implement input and output functions of the terminal 100, and after integration, the touch screen may be simply referred to as a touch screen. The display unit 130 may display an application program and corresponding operation steps.
The camera 140 may be used to capture still images or video. The object generates an optical image through the lens and projects the optical image onto the photosensitive element. The photosensitive element may be a charge coupled device (charge coupled device, CCD) or a Complementary Metal Oxide Semiconductor (CMOS) phototransistor. The photosensitive elements convert the optical signals to electrical signals, which are then transferred to the processor 190 for conversion to digital image signals.
The terminal 100 may further include at least one sensor 150, such as an acceleration sensor 151, a distance sensor 152, a fingerprint sensor 153, a temperature sensor 154. The terminal 100 may also be configured with other sensors such as gyroscopes, barometers, hygrometers, thermometers, infrared sensors, light sensors, motion sensors, and the like.
The audio circuit 160 and the audio playback component 170 may provide an audio interface between the user and the terminal 100. The audio circuit 160 may transmit the electrical signal converted from received audio data to the speaker 171, and the speaker 171 converts the electrical signal into a sound signal for output. The terminal 100 may also be configured with a volume button for adjusting the volume of the sound signal. In the other direction, the microphone 172 converts collected sound signals into electrical signals, which the audio circuit 160 receives and converts into audio data; the audio data is then output to the RF circuit 110 for transmission to, for example, another terminal, or to the memory 120 for further processing. In this application, the microphone 172 may capture the user's voice.
Wi-Fi belongs to a short-range wireless transmission technology, and the terminal 100 can help a user to send and receive e-mail, browse web pages, access streaming media and the like through the Wi-Fi module 180, so that wireless broadband internet access is provided for the user.
Processor 190 is a control center of terminal 100, connects various parts of the entire terminal using various interfaces and lines, and performs various functions of terminal 100 and processes data by running or executing software programs stored in memory 120, and calling data stored in memory 120. In some embodiments, processor 190 may comprise one or more processing units; processor 190 may also integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a baseband processor that primarily handles wireless communications. It will be appreciated that the baseband processor described above may not be integrated into processor 190. Processor 190 may run an operating system, an application program, a user interface display, and a touch response to implement the audio control method provided in the embodiments of the present application. In addition, processor 190 is coupled to display unit 130.
The bluetooth module 1100 is configured to interact with other bluetooth devices having a bluetooth module through a bluetooth protocol. For example, the terminal 100 may establish a bluetooth connection with a wearable electronic device (e.g., a smart watch) also provided with a bluetooth module through the bluetooth module 1100, thereby performing data interaction.
The terminal 100 also includes a power source 1200 (e.g., a battery) that provides power to the various components. The power supply may be logically connected to the processor 190 through a power management system, so that functions of managing charging, discharging, power consumption, etc. are implemented through the power management system. The terminal 100 may also be configured with power buttons for powering on and off the terminal, and for locking the screen, etc.
Fig. 2 is a software configuration block diagram of the terminal 100 according to the embodiment of the present application.
The layered architecture divides the software into several layers, each with a distinct role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into four layers: from top to bottom, the application layer, the application framework layer, the Android Runtime and system libraries, and the kernel layer.
The application layer may include a series of application packages.
As shown in fig. 2, the application package may include applications for cameras, gallery, calendar, phone calls, maps, navigation, WLAN, bluetooth, music, video, short messages, etc.
The application framework layer provides an application programming interface (application programming interface, API) and programming framework for application programs of the application layer. The application framework layer includes a number of predefined functions.
As shown in fig. 2, the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, and the like.
The window manager is used for managing window programs. The window manager can acquire the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like.
The content provider is used to store and retrieve data and make such data accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phonebooks, etc.
The view system includes visual controls, such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, a display interface including a text message notification icon may include a view displaying text and a view displaying a picture.
The telephony manager is used to provide the communication functions of the terminal 100. Such as the management of call status (including on, hung-up, etc.).
The resource manager provides various resources for the application program, such as localization strings, icons, pictures, layout files, video files, and the like.
The notification manager allows the application to display notification information in a status bar; it can be used to convey notification-type messages, which can disappear automatically after a short dwell without user interaction. For example, the notification manager is used to notify of download completion, message alerts, and the like. The notification manager may also present notifications in the form of a chart or scroll-bar text in the system top status bar, such as notifications of background-running applications, or notifications that appear on the screen as a dialog window. For example, a text message is prompted in the status bar, a prompt tone sounds, the terminal vibrates, or an indicator light blinks.
The Android Runtime includes a core library and a virtual machine. The Android Runtime is responsible for scheduling and management of the Android system.
The core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core libraries of Android.
The application layer and the application framework layer run in a virtual machine. The virtual machine executes java files of the application program layer and the application program framework layer as binary files. The virtual machine is used for executing the functions of object life cycle management, stack management, thread management, security and exception management, garbage collection and the like.
The system library may include a plurality of functional modules. For example: surface manager (surface manager), media Libraries (Media Libraries), three-dimensional graphics processing Libraries (e.g., openGL ES), 2D graphics engines (e.g., SGL), etc.
The surface manager is used to manage the display subsystem and provides a fusion of 2D and 3D layers for multiple applications.
Media libraries support a variety of commonly used audio, video format playback and recording, still image files, and the like. The media library may support a variety of audio and video encoding formats, such as MPEG4, h.264, MP3, AAC, AMR, JPG, PNG, etc.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is a layer between hardware and software. The kernel layer at least comprises a display driver, a camera driver, an audio driver and a sensor driver.
The workflow of the terminal 100 software and hardware is illustrated below in connection with a multimedia sound scenario for starting a game application.
When the touch screen 1301 receives a touch operation, a corresponding hardware interrupt is issued to the kernel layer. The kernel layer processes the touch operation into a raw input event (including information such as touch coordinates and the time stamp of the touch operation). The raw input event is stored at the kernel layer. The application framework layer acquires the raw input event from the kernel layer and identifies the control corresponding to the input event. Taking the touch operation as a touch click operation whose corresponding control is the game application icon as an example: the game application calls an interface of the application framework layer to start the game application, then starts the audio driver by calling the kernel layer, and plays a prompt tone, background sound, or other multimedia sound of the game application through the speaker 171.
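The patent does not show the playback call itself. As a minimal illustration only, an application could emit such a prompt tone through the standard Android MediaPlayer API; the resource name below is a hypothetical placeholder, not something named by the patent:

import android.content.Context;
import android.media.MediaPlayer;

public final class PromptTonePlayer {
    // Minimal sketch: play a bundled prompt tone through the default audio output.
    // R.raw.prompt_tone is a hypothetical resource id for illustration only.
    public static void play(Context ctx) {
        MediaPlayer mp = MediaPlayer.create(ctx, R.raw.prompt_tone);
        mp.setOnCompletionListener(MediaPlayer::release); // free the player when done
        mp.start();
    }
}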
Fig. 3 is a schematic structural diagram of another terminal provided in the embodiment of the present application, where the terminal 30 includes a processor 301, a sending component 302, and an audio playing component 303. The embodiment will be specifically described below with reference to the terminal 30 as an example. It should be understood that the terminal 30 shown in fig. 3 is only one example, and that the terminal 30 may have more or fewer components than shown in fig. 3, may combine two or more components, or may have a different configuration of components.
In an actual application scene, a user can interact with the terminal through a hand touch operation, and can also interact with the terminal through auxiliary control equipment such as a touch screen pen, a mouse and the like which are connected with the terminal in a wired or wireless mode. When the microphone is arranged in the terminal, the user can interact with the terminal through voice, and when the camera device is arranged in the terminal, the user can interact with the terminal through gestures.
In general, the audio data to be output by each audio object (e.g., an application, the cellular phone, etc.) in a terminal is output only from that audio object; for example, audio data to be output by WeChat is output only from WeChat, and audio data to be output by the cellular phone is output only from the cellular phone, so the transmission path of each audio object's audio data to be output is fixed. In some scenarios, however, it is desirable to set the transmission path of an audio object's audio data to be output flexibly, so as to improve the flexibility of audio data transmission.
For this reason, in this embodiment of the present application, the processor 301 may obtain the audio data to be output of the first audio object, determine, based on the conversion relationship between the stored audio data to be output of the audio object and the audio data to be input of the audio object, whether there is a second audio object that takes the audio data to be output of the first audio object as input, if there is a second audio object, convert the audio data to be output of the first audio object into the audio data to be input of the second audio object, send the converted audio data to be input to the second audio object for processing, and then send the audio data processed by the second audio object to the outside by the sending component 302.
In particular, when the first audio object is an application, the second audio object may be a different application than the application or the second audio object may be a cellular phone; when the first audio object is a cellular telephone, the second audio object may be an application. That is, the audio data to be output of a certain application may be input to other applications or to the cellular phone, or the audio data to be output of the cellular phone may be input to a certain application, thereby flexibly changing the transmission path of the audio data of one audio object.
It should be noted that when the cellular phone is in a call, it does not by default expose audio data to be output or audio data to be input, so proxy software (a proxy) can be used to obtain the audio data to be output and to be input in the call scene for the cellular phone.
In specific implementation, the conversion relation between the audio data to be output of an audio object and the audio data to be input of an audio object describes the audio transmission direction between different audio objects; to improve the configuration flexibility of the conversion relation, both a temporary conversion relation and a default conversion relation can be configured.
The configuration of these two conversion relationships is described below in connection with specific audio objects.
1. Configuration of default conversion relationships.
Fig. 4 is a schematic diagram of a path configuration interface provided in an embodiment of the present application. The configurable audio objects in fig. 4 include: the cellular phone, WeChat, an intercom application (an application providing intercom services, such as a built-in intercom application) and QQ conference. The default conversion relations configured by the user in fig. 4 are:
Default conversion relation 1: intercom output -> cellular phone input, namely the audio data to be output of the intercom application is output through the cellular phone;
Default conversion relation 2: cellular phone output -> intercom input, namely the audio data to be output of the cellular phone is output through the intercom application;
Default conversion relation 3: WeChat output -> intercom input, namely the audio data to be output of WeChat is output through the intercom application.
Under these three default conversion relations, the audio data to be output of each audio object is not output from that audio object itself. Moreover, the audio data to be output of the cellular phone and of WeChat are both output through the intercom application, so the audio data to be output of at least two audio objects can together serve as the input of another audio object.
In addition, the path configuration interface may also display an OK button and a Cancel button; the configuration does not take effect when the Cancel button is clicked, and takes effect persistently when the OK button is clicked.
2. Configuration of temporary conversion relations.
In practice, the user may already be in a call but want to invite new members into the call via another application or the cellular phone. For this, when the audio data of the second audio object is call data, the processor 301 may further display an access button on the call interface, display the accessible audio objects in response to a trigger operation on the access button, determine the first audio object from the accessible audio objects in response to an audio-object selection operation, and save a temporary conversion relation between the audio data to be output of the first audio object and the audio data to be input of the second audio object.
Fig. 5 is a schematic diagram of a call interface of the intercom application provided in an embodiment of the present application. In the intercom state, an access button may be displayed on the call interface; the user can display the accessible applications or the cellular phone by clicking the access button. If the user wants to access QQ conference, the contacts in QQ may be displayed after the user clicks the icon of QQ conference, and clicking a contact invites that contact to join the call. In addition, after it is determined that the user has clicked QQ conference, a temporary conversion relation may also be established: QQ conference output -> intercom input, namely the audio data to be output of QQ conference is output through the intercom application.
In this case, the user generally only wants to establish a temporary conversion relation from the output of the first audio object to the input of the second audio object, so the relation can be configured to take effect for a single use only.
In implementation, the default conversion relations and temporary conversion relations can be recorded in a map table. In the map table, the source of each conversion relation is the audio object corresponding to the audio data to be output, and the sink is the audio object corresponding to the audio data to be input. The map table can be stored as an eXtensible Markup Language (eXtensible Markup Language, xml) file or another type of configuration file.
The priority of the temporary conversion relations may be set higher than that of the default conversion relations, in order to avoid conflicts between the two. Therefore, when determining whether there is a second audio object that takes the audio data to be output of the first audio object as input, the processor 301 is specifically configured to first make this determination based on the stored temporary conversion relations; if no second audio object is found there, it then makes the determination based on the stored default conversion relations.
In particular, the conversion performed when converting the audio data to be output of the first audio object into the audio data to be input of the second audio object includes operations such as encapsulation format conversion, bit-width conversion, and resampling, and aims to produce audio data conforming to the input format requirements of the second audio object.
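The patent names these steps but not their implementation. As a hedged sketch of just one of them, bit-width conversion, the following Java helper assumes 8-bit unsigned mono PCM in and 16-bit signed PCM out; resampling and re-encapsulation would be separate steps:

final class BitWidthConverter {
    // Widens 8-bit unsigned PCM samples to 16-bit signed PCM samples.
    static short[] widenTo16Bit(byte[] pcm8) {
        short[] out = new short[pcm8.length];
        for (int i = 0; i < pcm8.length; i++) {
            // 8-bit PCM is unsigned with midpoint 128; re-center, then scale to the 16-bit range.
            out[i] = (short) (((pcm8[i] & 0xFF) - 128) << 8);
        }
        return out;
    }
}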
In some scenarios, the second audio object does not currently have audio data to be input, and the processor 301 may directly send the converted audio data to be input to the second audio object for processing. In other scenarios, the second audio object currently has audio data to be input, and the processor 301 may further perform mixing processing on the current audio data to be input and the converted audio data to be input to obtain mixed input audio data, and send the mixed input audio data to the second audio object for processing, so that an opposite end (another terminal) in communication with the terminal may simultaneously receive the output audio data of the first audio object and the input audio data of the second audio object.
In addition, in order to enable the user on the terminal side to hear the output audio data of the first audio object and the second audio object at the same time, the processor 301 may further acquire the audio data to be output of the second audio object, mix the audio data to be output of the first audio object with the audio data to be output of the second audio object to obtain mixed output audio data, and then have the audio playing component 303 output the mixed output audio data. In this way, the need to output audio data of at least two audio objects simultaneously can be well met.
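The patent leaves the mixing algorithm unspecified. A common choice, shown here only as an assumption, is per-sample summation with saturation over two 16-bit PCM streams that already share a sample rate and channel layout:

final class Mixer {
    // Mixes two 16-bit PCM buffers sample-by-sample, clamping to avoid overflow.
    static short[] mix(short[] a, short[] b) {
        int n = Math.min(a.length, b.length);
        short[] out = new short[n];
        for (int i = 0; i < n; i++) {
            int s = a[i] + b[i];
            if (s > Short.MAX_VALUE) s = Short.MAX_VALUE;
            if (s < Short.MIN_VALUE) s = Short.MIN_VALUE;
            out[i] = (short) s;
        }
        return out;
    }
}

The same helper would serve for both the mixed input audio data and the mixed output audio data described above, since both are plain PCM mixes.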
In order to flexibly control output audio data, the processor 301 may further perform corresponding control processing on the audio data in response to a control operation on the audio data, for any one of the audio data to be output of the first audio object, the audio data to be output of the second audio object, and the mixed output audio data, wherein the control operation includes suspending output, starting output, and ending output.
That is, when the audio data to be output of the first audio object is output from the second audio object alone, the audio data to be output of the first audio object may be paused, started, or ended; when the audio data to be output of the first audio object and the audio data to be output of the second audio object are output together from the second audio object, the audio data to be output of the first audio object may be individually paused, started or ended, the audio data to be output of the second audio object may be individually paused, started or ended, and the mixed output audio data may be individually paused, started or ended.
Moreover, when the audio data to be output of the first audio object and of the second audio object are output together from the second audio object, the audio data to be output of the first audio object or of the second audio object can be paused, started or ended directly on the call interface, without returning to the operation interface of the first or second audio object for separate control; this improves the convenience of audio management and thus the user experience.
In practical applications, when there are at least two audio output modes, a default audio output mode is selected for output; for example, when the two audio output modes are the earpiece and the loudspeaker, the earpiece is selected by default, and when the two audio output modes are the earpiece and the headphone, the headphone is selected by default. But users may sometimes wish to switch between different audio output modes.
For this reason, when there are at least two audio output modes, the processor 301 may further determine the current audio output mode in response to a switching operation on the audio output mode, and the audio playing component 303 then outputs the mixed output audio data through the determined audio output mode. That is, when the two audio output modes are the earpiece and the speaker, the user can freely choose to output the mixed output audio data through the earpiece or the speaker; when the two audio output modes are the earpiece and the headphone, the user can freely choose to output it through the earpiece or the headphone.
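The patent does not name the switching API. On stock Android, one common way to toggle call audio between the earpiece and the speaker is via AudioManager; this minimal sketch assumes that mechanism:

import android.media.AudioManager;

final class OutputRouter {
    // Routes call audio to the speaker (true) or back to the earpiece (false).
    static void routeToSpeaker(AudioManager am, boolean speakerOn) {
        am.setMode(AudioManager.MODE_IN_COMMUNICATION); // voice-call routing mode
        am.setSpeakerphoneOn(speakerOn);
    }
}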
Fig. 6 is a schematic diagram of a processing procedure of audio data in a terminal according to an embodiment of the present application, where the audio input device of the terminal is a microphone, the audio output devices are headphones and a speaker, and the audio objects on the terminal are a music application, WeChat, a sound assistant and a recorder. All audio data in the terminal are managed by the audio stream management service; output audio data are shown by solid lines in fig. 6, and input audio data by broken lines.
Taking the music application as an example: NuPlayer (a streaming media framework) decodes the audio data to be output of the music application to obtain pulse code modulation (Pulse Code Modulation, PCM) data; an audio track (AudioTrack) records operations on the PCM data such as sound effects and volume increase or decrease; and the audio stream management service performs the corresponding audio processing on the PCM data based on what the AudioTrack recorded, then outputs the processed PCM data through the speaker or a headset.
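For reference, the AudioTrack stage of this pipeline can be exercised with the standard framework API; the parameters below (44.1 kHz, mono, 16-bit, USAGE_MEDIA) are illustrative assumptions, not values taken from the patent:

import android.media.AudioAttributes;
import android.media.AudioFormat;
import android.media.AudioTrack;

public final class PcmPlayer {
    // Plays a buffer of 16-bit mono PCM at 44.1 kHz through an AudioTrack.
    public static void play(short[] pcm) {
        int sampleRate = 44100;
        int minBuf = AudioTrack.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);
        AudioTrack track = new AudioTrack.Builder()
                .setAudioAttributes(new AudioAttributes.Builder()
                        .setUsage(AudioAttributes.USAGE_MEDIA)
                        .setContentType(AudioAttributes.CONTENT_TYPE_MUSIC)
                        .build())
                .setAudioFormat(new AudioFormat.Builder()
                        .setSampleRate(sampleRate)
                        .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
                        .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
                        .build())
                .setBufferSizeInBytes(minBuf)
                .build();
        track.play();
        track.write(pcm, 0, pcm.length); // blocking write of the PCM samples
        track.stop();
        track.release();
    }
}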
Taking WeChat as an example: an AudioTrack records operations such as volume increase or decrease on the audio data to be output of WeChat; the audio stream management service performs the corresponding audio processing on WeChat's audio data to be output based on what the AudioTrack recorded, then outputs the processed data through the speaker or a headset.
Taking a sound assistant as an example, the audio stream management service is configured to acquire audio data from a microphone, send the acquired audio data to an audio record (AudioRecord) in real time, record the audio data by the AudioRecord, obtain a real-time audio stream, and send the real-time audio stream to the sound assistant.
Taking the recorder as an example: the audio stream management service acquires audio data from the microphone and sends it in real time to an AudioRecord; the AudioRecord records the audio data to obtain a real-time audio stream and sends it to a StagefrightRecorder; the StagefrightRecorder encodes the real-time audio stream to obtain a recorded audio stream and sends the recorded audio stream to the recorder.
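The capture step corresponds to the standard AudioRecord API. A minimal sketch follows, assuming a 16 kHz mono 16-bit format and the RECORD_AUDIO permission already granted; the patent does not specify these parameters:

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

public final class MicCapture {
    // Reads one block of 16-bit mono PCM from the microphone.
    public static short[] captureBlock() {
        int sampleRate = 16000;
        int minBuf = AudioRecord.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
        AudioRecord record = new AudioRecord(MediaRecorder.AudioSource.MIC,
                sampleRate, AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT, minBuf);
        short[] buffer = new short[minBuf / 2];
        record.startRecording();
        int read = record.read(buffer, 0, buffer.length); // blocking read
        record.stop();
        record.release();
        return read > 0 ? java.util.Arrays.copyOf(buffer, read) : new short[0];
    }
}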
The above examples show audio data to be output of the music application being output from the music application, audio data to be output of WeChat being output from WeChat, and the audio data collected from the microphone being delivered to the sound assistant and the recorder, respectively.
The following describes embodiments of the present application in connection with specific embodiments.
Suppose user A is in an intercom call with user B through the intercom application, and user A wants to invite user C to participate via the cellular phone. Then, for the terminal used by user A, its processing of the audio data can be seen in fig. 7.
After the terminal used by user A obtains the audio data to be output of the cellular phone (corresponding to the voice of user C) based on the phone-state monitoring result, it can convert that audio data into one path of audio data to be input of the intercom application, and mix this path with the intercom application's original path of audio data to be input obtained from the microphone (corresponding to the voice of user A) to obtain mixed input audio data. The mixed input audio data is used as input of the intercom application, and the intercom application only needs to process it according to its original audio data processing logic, so user B can hear the voices of user A and user C through the intercom application. Similarly, the mixed input audio data is used as input of the cellular phone, which processes it according to its original audio data processing logic, so user C can hear the voices of user A and user B through the cellular phone.
Meanwhile, in order to let user A hear the voices of user B and user C through the intercom application, after the terminal used by user A obtains the audio data to be output of the cellular phone (corresponding to the voice of user C), it can also obtain the audio data to be output of the intercom application (corresponding to the voice of user B), then mix the two to obtain mixed output audio data. The mixed output audio data is used as the output of the intercom application and is output from the receiver or an earphone, so user A can hear the voices of user B and user C through the intercom application.
In order to achieve the above object, a user may be allowed to configure conversion relationships between audio data to be output of an audio object and audio data to be input of the audio object in a terminal, each conversion relationship including: a source (source) for indicating an audio object from which audio data is to be output, a target (target) for indicating an audio object to which audio data is to be input, a type (type) for indicating whether the audio data is to be input or output, and an application package name. The conversion relationships may include a default conversion relationship and a temporary conversion relationship, the temporary conversion relationship having a higher priority than the default conversion relationship. Both default and temporary conversion relationships may be written in xml files or saved in json format.
Taking an xml file as an example, a certain default conversion relationship recorded in the xml file is as follows:
<stream type="voip"
source_io_type="output"
source_app="com.tencent.mm"
target_app="com.android.phone"
target_io_type="input"></stream>
Its meaning is that the output (source_io_type="output") of the voip stream of WeChat (package name com.tencent.mm) is converted into the input (target_io_type="input") of the cellular phone (target_app="com.android.phone").
The xml file may include a plurality of conversion relations, which can be loaded when the audio stream management service starts, thereby creating a default rule linked list default_rules_list.
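The patent does not publish parsing code. The following hedged Java sketch loads <stream> elements like the one above into such a list; StreamRule is a hypothetical holder whose fields simply mirror the xml attributes:

import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserFactory;

// Hypothetical holder for one conversion relation.
final class StreamRule {
    String type, sourceIoType, sourceApp, targetApp, targetIoType;
}

final class RuleLoader {
    // Parses <stream .../> elements into a rule list (the default_rules_list).
    static List<StreamRule> load(Reader xml) throws Exception {
        XmlPullParser p = XmlPullParserFactory.newInstance().newPullParser();
        p.setInput(xml);
        List<StreamRule> rules = new ArrayList<>();
        for (int ev = p.getEventType(); ev != XmlPullParser.END_DOCUMENT; ev = p.next()) {
            if (ev == XmlPullParser.START_TAG && "stream".equals(p.getName())) {
                StreamRule r = new StreamRule();
                r.type = p.getAttributeValue(null, "type");
                r.sourceIoType = p.getAttributeValue(null, "source_io_type");
                r.sourceApp = p.getAttributeValue(null, "source_app");
                r.targetApp = p.getAttributeValue(null, "target_app");
                r.targetIoType = p.getAttributeValue(null, "target_io_type");
                rules.add(r);
            }
        }
        return rules;
    }
}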
If, during an intercom call, the user wants to invite a WeChat friend to talk together in the intercom group, the intercom output needs to be connected to the WeChat input and the WeChat output connected to the intercom input, generating two temporary conversion relations, namely:
<stream type="voip"
source_io_type="output"
source_app="com.tencent.mm"
target_app="com.hmct.duijiang"
target_io_type="input"></stream>
<stream type="voip"
source_io_type="output"
source_app="com.hmct.duijiang"
target_app="com.tencent.mm"
target_io_type="input"></stream>
and, a temporary rule linked list intjrules_list may be created.
In addition, two stream linked lists may also be created: a first stream linked list for recording audio input streams (i.e., audio data to be input) and a second stream linked list for recording audio output streams (i.e., audio data to be output). In practical applications, starting sound playback creates an audio output stream, starting sound recording creates an audio input stream, and during a WeChat call an audio input stream and an audio output stream are created simultaneously. Any audio input stream or audio output stream is automatically registered with the audio stream management service when it is created, so the audio stream management service knows the stream information of the audio stream, such as: stream identifier, sink/source, type (audio input stream or audio output stream), application package information (used to mark which application the audio stream comes from), sampling rate, and bit width. The audio stream management service may add audio streams to the first or second stream linked list based on this stream information. When playback or recording of an audio stream finishes, the audio stream management service may delete the corresponding stream information from the first or second stream linked list according to the stream identifier. Thus, the stream information in the two lists is dynamically updated with the audio playback and recording status in the terminal.
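A hedged sketch of this bookkeeping follows; the structures and names are assumptions standing in for the patent's two stream linked lists and the registration step:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stream descriptor mirroring the stream information listed above.
final class StreamInfo {
    int streamId;
    boolean isInput;   // true: audio input stream, false: audio output stream
    String appPackage; // which application the stream comes from
    int sampleRate;
    int bitWidth;
}

final class StreamRegistry {
    private final Map<Integer, StreamInfo> inputStreams = new ConcurrentHashMap<>();
    private final Map<Integer, StreamInfo> outputStreams = new ConcurrentHashMap<>();

    // Called automatically when a stream is created.
    void register(StreamInfo info) {
        (info.isInput ? inputStreams : outputStreams).put(info.streamId, info);
    }

    // Called when playback or recording of the stream finishes.
    void unregister(int streamId, boolean isInput) {
        (isInput ? inputStreams : outputStreams).remove(streamId);
    }
}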
It should be noted that the cellular phone is generally independent of the Android system sound path (its package name is, for example, com.android.phone). Since the cellular phone by default does not generate audio streams during a call, the audio input and output streams can be pulled from the call scene through a proxy, with the audio input stream added to the first stream linked list and the audio output stream added to the second stream linked list.
Subsequently, when a certain audio stream is acquired, its type and application_package can be obtained from the first or second stream linked list based on the stream identifier; if the type is audio output stream, the temporary rule linked list intjrules_list is searched according to the application_package.
If an audio object that takes the to-be-output audio stream of this application_package as input exists in intjrules_list, the to-be-output audio stream can be converted into the to-be-input audio stream of that audio object, and the converted stream is processed as one path of input of that audio object.
If no such audio object exists in intjrules_list, default_rules_list is searched according to the application_package; if an audio object that takes the to-be-output audio stream of this application_package as input exists in default_rules_list, the to-be-output audio stream can likewise be converted into the to-be-input audio stream of that audio object and processed as one path of its input.
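Putting the two searches together, a minimal lookup sketch, reusing the hypothetical StreamRule holder from the loading sketch above, could be:

import java.util.List;

final class RuleLookup {
    // Searches the temporary rules first, then the default rules, mirroring the
    // priority order described above. Returns null when no relation matches.
    static StreamRule findTarget(String sourceApp,
                                 List<StreamRule> temporaryRules,
                                 List<StreamRule> defaultRules) {
        for (StreamRule r : temporaryRules) {
            if ("output".equals(r.sourceIoType) && sourceApp.equals(r.sourceApp)) return r;
        }
        for (StreamRule r : defaultRules) {
            if ("output".equals(r.sourceIoType) && sourceApp.equals(r.sourceApp)) return r;
        }
        return null; // no conversion relation: the stream is output normally
    }
}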
In the above process, if the found audio object originally has one audio stream to be input, the audio stream to be input and the converted audio stream to be input may be mixed to obtain a mixed input audio stream, and the mixed input audio stream is used as the input of the audio object to be processed.
In addition, two device lists may also be established: an input device list and an output device list. Default configured devices are loaded at startup; after startup, a monitoring service is registered to manage device monitoring, and the input device list and the output device list are updated respectively when a device is connected or disconnected. When a wired headset with a microphone is plugged in, headset_out can be added to the output device list and headset_in to the input device list; when a headset without a microphone is plugged in, only headset_out needs to be added to the output device list. In this way, the two device lists are dynamically updated as the audio devices in the terminal change, and the current usage state of the input and output devices can be known from them; a sketch of such monitoring follows the device categories below.
The devices are divided into:
Default configured devices, i.e. the devices carried by the terminal itself: typically the output devices include an earpiece and a speaker, and the input devices include a microphone, etc. These devices may be preconfigured by means of a configuration file, and reading the configuration file at power-on reveals which devices the terminal itself carries.
Dynamic access devices, i.e. devices that can be plugged in or pulled out dynamically after startup, such as wired earphones (output plus microphone), USB earphones (digital earphones), Bluetooth earphones (including input and output), Bluetooth speakers (some include output only), high-definition multimedia interface (high definition multimedia interface, HDMI) devices, display-port devices, wireless display (wifi-display) devices, and the like.
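As for the monitoring sketch promised above: AudioDeviceCallback is a real Android framework facility for observing such plug events, while the list-update bodies below are placeholders for the patent's own bookkeeping:

import android.media.AudioDeviceCallback;
import android.media.AudioDeviceInfo;
import android.media.AudioManager;

final class DeviceMonitor extends AudioDeviceCallback {
    @Override
    public void onAudioDevicesAdded(AudioDeviceInfo[] added) {
        for (AudioDeviceInfo d : added) {
            if (d.isSink()) { /* add to output device list */ }
            if (d.isSource()) { /* add to input device list */ }
        }
    }

    @Override
    public void onAudioDevicesRemoved(AudioDeviceInfo[] removed) {
        for (AudioDeviceInfo d : removed) {
            if (d.isSink()) { /* remove from output device list */ }
            if (d.isSource()) { /* remove from input device list */ }
        }
    }

    static void start(AudioManager am, DeviceMonitor m) {
        am.registerAudioDeviceCallback(m, null); // null handler: main thread
    }
}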
In implementation, when the terminal has at least two audio output modes, the user can be allowed to switch between them, improving the flexibility of audio data output. Moreover, when it is determined from the output device list that a certain output device has not been used for a specified period, that output device may be turned off to save power; similarly, when it is determined from the input device list that a certain input device has not been used for a specified period, that input device may also be turned off to save power.
In the embodiments of the application, a user can set conversion relations between the audio data to be output of audio objects and the audio data to be input of audio objects in the terminal so as to change the transmission path of an audio object's audio data, can set the audio output mode of an audio object, can switch the input or output of an audio object, and can interrupt the input or output of a certain audio object, so the management of audio data in the terminal is flexible and comprehensive.
Fig. 8 is a flowchart of an audio control method according to an embodiment of the present application, where the method is applied to a terminal, and the method includes the following steps.
In step S801, audio data to be output of a first audio object is acquired.
Wherein the first audio object may be an application or a cellular phone.
In step S802, it is determined whether there is a second audio object having the audio data to be output of the first audio object as an input, based on the conversion relationship between the audio data to be output of the saved audio object and the audio data to be input of the audio object.
In specific implementation, it can first be determined, based on a temporary conversion relationship between the stored audio data to be output of the audio object and the audio data to be input of the audio object, whether a second audio object taking the audio data to be output of the first audio object as input exists; if not, it is then determined, based on a default conversion relationship between the stored audio data to be output of the audio object and the audio data to be input of the audio object, whether such a second audio object exists. Consulting the temporary relationship first avoids conflicts between the temporary conversion relationship and the default conversion relationship.
In step S803, if it is determined that the second audio object exists, the audio data to be output of the first audio object is converted into the audio data to be input of the second audio object.
In specific implementation, according to the requirements of the second audio object on input audio data, conversion processing such as format conversion and sampling-rate conversion can be performed on the audio data to be output of the first audio object.
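As an illustration of sampling-rate conversion, a minimal linear-interpolation resampler on 16-bit PCM could look as follows; real converters typically use polyphase filtering, and the class and parameter names here are assumptions.

```java
// A minimal illustration of sampling-rate conversion by linear interpolation;
// illustrative only, not the conversion processing mandated by this application.
final class Resampler {
    static short[] resample(short[] in, int srcRate, int dstRate) {
        int outLen = (int) ((long) in.length * dstRate / srcRate);
        short[] out = new short[outLen];
        for (int i = 0; i < outLen; i++) {
            double pos = (double) i * srcRate / dstRate;      // position in the source
            int idx = (int) pos;
            double frac = pos - idx;
            short a = in[Math.min(idx, in.length - 1)];
            short b = in[Math.min(idx + 1, in.length - 1)];
            out[i] = (short) Math.round(a + frac * (b - a));  // interpolate neighbors
        }
        return out;
    }
}
```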
In step S804, the converted audio data to be input is sent to the second audio object for processing.
In specific implementation, the second audio object can process the converted audio data to be input according to its original audio processing logic. That is, the technical solution of the embodiment of the application does not require changing the audio object itself, yet audio data can be transmitted between different audio objects, which makes the solution easier to implement.
In step S805, the audio data after the second audio object processing is transmitted to the outside.
That is, the terminal sends the audio data processed by the second audio object to another terminal, and the user at the other terminal side can hear the sound coming from the first audio object through the second audio object.
In particular, when the first audio object is an application, the second audio object may be a different application than the application or the second audio object may be a cellular phone; when the first audio object is a cellular telephone, the second audio object may be an application. That is, the audio data to be output of a certain application may be input to other applications or to the cellular phone, or the audio data to be output of the cellular phone may be input to a certain application, thereby flexibly changing the transmission path of the audio data of one audio object.
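Steps S801 to S805 can be condensed into a small, self-contained sketch. Every type and helper below is a hypothetical stand-in, and the rule map abstracts the conversion relationships discussed earlier; it is not the application's actual implementation.

```java
import java.util.Map;
import java.util.Optional;

// A condensed sketch of steps S801 to S805 under assumed types and helpers.
class Fig8Flow {
    interface AudioObject {
        short[] process(short[] input); // the object's original processing logic
    }

    private final Map<String, AudioObject> rules; // first object's package -> second object

    Fig8Flow(Map<String, AudioObject> rules) {
        this.rules = rules;
    }

    void onOutputAudio(String firstObjectPackage, short[] toOutput) {
        // S801/S802: look up a second audio object taking this output as input.
        Optional<AudioObject> second = Optional.ofNullable(rules.get(firstObjectPackage));
        if (second.isEmpty()) {
            playLocally(toOutput); // no conversion relationship: normal output path
            return;
        }
        // S803: convert into the form the second object expects (format and
        // sampling-rate conversion are elided in this sketch).
        short[] toInput = toOutput;
        // S804: hand the converted data to the second object for processing.
        short[] processed = second.get().process(toInput);
        // S805: send the processed audio data outwards, e.g. to a peer terminal.
        sendExternally(processed);
    }

    private void playLocally(short[] pcm) { /* normal local playback */ }
    private void sendExternally(short[] pcm) { /* e.g. the uplink of a call */ }
}
```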
Fig. 9 is a flowchart of yet another audio control method provided in an embodiment of the present application, where the method is applied to a terminal, and the method includes the following steps.
In step S901, audio data to be output of a first audio object is acquired.
In step S902, it is determined whether or not there is a second audio object having the audio data to be output of the first audio object as an input, based on the conversion relationship between the audio data to be output of the saved audio object and the audio data to be input of the audio object.
In step S903, if it is determined that the second audio object exists, the audio data to be output of the first audio object is converted into the audio data to be input of the second audio object.
In step S904, if it is determined that the second audio object has audio data to be input currently, mixing processing is performed on the current audio data to be input and the converted audio data to be input, so as to obtain mixed input audio data.
In step S905, the mixed input audio data is transmitted to the second audio object for processing.
In step S906, the audio data after the second audio object processing is transmitted to the outside.
In step S907, audio data to be output of the second audio object is acquired.
In step S908, the audio data to be output of the first audio object and the audio data to be output of the second audio object are mixed, so as to obtain mixed output audio data.
In step S909, the mixed output audio data is output.
In specific implementation, for any one of the audio data to be output of the first audio object, the audio data to be output of the second audio object, and the mixed output audio data, corresponding control processing may be performed on that audio data in response to a control operation on it, where the control operation includes pausing output, starting output, and ending output.
Moreover, when at least two audio output modes exist, the current audio output mode can be determined in response to a switching operation on the audio output mode, and the mixed output audio data is then output through the determined audio output mode.
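The mixing processing of steps S904 and S908 can be illustrated as sample-wise addition with saturation, assuming 16-bit PCM; per-stream gain control, which a real mixer may apply, is omitted from this sketch.

```java
// A minimal sketch of mixing two PCM buffers; illustrative assumptions only.
final class Mixer {
    static short[] mix(short[] a, short[] b) {
        int n = Math.max(a.length, b.length);
        short[] out = new short[n];
        for (int i = 0; i < n; i++) {
            int sa = i < a.length ? a[i] : 0;
            int sb = i < b.length ? b[i] : 0;
            int sum = sa + sb;
            // Clamp instead of wrapping to avoid audible overflow artifacts.
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }
        return out;
    }
}
```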
Based on the same technical concept, the embodiment of the present application further provides an audio control device. Since the principle by which the audio control device solves the problem is similar to that of the audio control method, the implementation of the device can refer to the implementation of the method, and repeated description is omitted. Fig. 10 is a schematic structural diagram of an audio control device according to an embodiment of the present application, which includes an obtaining module 1001, a determining module 1002, a converting module 1003, a processing module 1004, and a sending module 1005.
An obtaining module 1001, configured to obtain audio data to be output of a first audio object;
a determining module 1002, configured to determine whether a second audio object having the audio data to be output of the first audio object as an input exists based on a conversion relationship between the stored audio data to be output of the audio object and the audio data to be input of the audio object;
a conversion module 1003, configured to convert audio data to be output of the first audio object into audio data to be input of the second audio object if it is determined that the second audio object exists;
the processing module 1004 is configured to send the converted audio data to be input to the second audio object for processing;
a sending module 1005, configured to send the audio data processed by the second audio object to the outside.
In some embodiments, the determining module 1002 is specifically configured to:
determining whether a second audio object taking the audio data to be output of the first audio object as input exists or not based on a temporary conversion relation between the stored audio data to be output of the audio object and the audio data to be input of the audio object;
if no such second audio object exists, determining whether a second audio object taking the audio data to be output of the first audio object as input exists based on a default conversion relationship between the stored audio data to be output of the audio object and the audio data to be input of the audio object.
In some embodiments, further comprising:
a mixing module 1006, configured to, if it is determined that the second audio object has audio data to be input currently, perform mixing processing on the audio data to be input currently and the audio data to be input obtained by conversion, so as to obtain mixed input audio data;
the processing module 1004 is further configured to send the mixed input audio data to the second audio object for processing.
In some embodiments, a mixing module 1006 and an output module 1007 are further included:
the obtaining module 1001 is further configured to obtain audio data to be output of the second audio object;
the mixing module 1006 is configured to perform mixing processing on audio data to be output of the first audio object and audio data to be output of the second audio object, so as to obtain mixed output audio data;
the output module 1007 is configured to output the mixed output audio data.
In some embodiments, the apparatus further comprises a control module 1008 for:
and for any one of the audio data to be output of the first audio object, the audio data to be output of the second audio object and the mixed output audio data, responding to a control operation of the audio data, and performing corresponding control processing on the audio data, wherein the control operation comprises output suspension, output starting and output ending.
In some embodiments, if there are at least two audio output modes, further comprising:
a switching module 1009, configured to determine a current audio output mode in response to a switching operation of the audio output mode;
the output module 1007 is specifically configured to output the mixed output audio data through the determined audio output mode.
In some embodiments, when the first audio object is an application, the second audio object is a different application than the application or the second audio object is a cellular telephone; when the first audio object is a cellular phone, the second audio object is an application.
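For exposition, the modules of Fig. 10 can be mapped to interfaces as in the following sketch; all names and signatures are assumptions rather than the application's actual API.

```java
// An illustrative mapping of the Fig. 10 modules to Java interfaces.
interface ObtainingModule   { short[] obtainToOutput(String firstObject); }             // module 1001
interface DeterminingModule { String determineSecondObject(String firstObject); }       // module 1002; null if none
interface ConversionModule  { short[] convert(short[] toOutput, String secondObject); } // module 1003
interface ProcessingModule  { short[] process(short[] toInput, String secondObject); }  // module 1004
interface SendingModule     { void sendExternally(short[] processed); }                 // module 1005
```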
In this embodiment of the present application, the division of the modules is schematic and is only a logical function division; there may be other division manners in actual implementation. In addition, the functional modules in the embodiments of the present application may be integrated into one processor, may exist separately and physically, or two or more modules may be integrated into one module. The coupling of the modules to each other may be achieved through interfaces, which are typically electrical communication interfaces, although mechanical or other forms of interface are not excluded. Thus, modules illustrated as separate components may or may not be physically separate, may be located in one place, or may be distributed in different locations on the same or different devices. The integrated modules may be implemented in the form of hardware or in the form of software functional modules.
The embodiment of the application also provides a storage medium, and when instructions in the storage medium are executed by a processor of a terminal, the terminal can execute the audio control method related to the foregoing embodiment.
In some possible embodiments, various aspects of the audio control method provided in the present application may also be implemented in the form of a program product, where the program product includes program code which, when the program product runs on a terminal, causes the terminal to execute the audio control method referred to in the foregoing embodiments.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product for audio control in embodiments of the present application may take the form of a CD-ROM and include program code that can run on a computing device. However, the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, radio Frequency (RF), etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In cases involving remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, such as a local area network (Local Area Network, LAN) or wide area network (Wide Area Network, WAN), or may be connected to an external computing device (e.g., connected over the internet using an internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functions of two or more of the elements described above may be embodied in one element in accordance with embodiments of the present application. Conversely, the features and functions of one unit described above may be further divided into a plurality of units to be embodied.
Furthermore, although the operations of the methods of the present application are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (8)

1. A terminal, comprising:
a processor configured to:
when an audio stream is acquired, inquiring the type and the application packet name of the audio stream from a first stream chain table or a second stream chain table based on the stream identification of the audio stream, wherein the first stream chain table is used for recording stream information of an audio input stream, the second stream chain table is used for recording stream information of an audio output stream, and the stream information in the first stream chain table and the second stream chain table is updated along with the audio playing state and the recording state in the terminal;
if the type of the audio stream is an audio output stream, searching whether an audio object taking the audio output stream of the application packet name of the audio stream as input exists in a temporary rule linked list; if not, searching whether an audio object taking an audio output stream of the audio stream application packet name as an input exists in a default rule linked list, wherein any rule linked list in the temporary rule linked list and the default rule linked list comprises: a source for indicating an audio object from which the audio output stream is coming, a target for indicating an audio object to which the audio input stream is going, a type for indicating whether the audio input stream or the audio output stream;
Converting the audio stream into an audio input stream of the audio object when it is determined that there is an audio object input with an audio output stream of an application packet name of the audio stream;
transmitting the converted audio input stream to the audio object for processing;
and a sending component configured to send the audio data processed by the audio object outwards.
2. The terminal of claim 1, wherein the processor is further configured to:
if it is determined that the audio object already has an audio input stream, mixing the audio input stream of the audio object and the audio input stream obtained by conversion to obtain a mixed audio input stream;
and sending the mixed audio input stream to the audio object for processing.
3. The terminal of claim 1, wherein the processor is further configured to:
mixing the acquired audio output stream of the audio object and the audio output stream coming from the same audio object as the audio stream, to obtain a mixed audio output stream;
further comprises:
an audio playing component configured to output the mixed audio output stream.
4. The terminal of claim 3, wherein the processor is further configured to:
for any one of the acquired audio output stream of the audio object, the audio output stream coming from the same audio object as the audio stream, and the mixed audio output stream, performing corresponding control processing on the audio output stream in response to a control operation on the audio output stream, wherein the control operation comprises pausing output, starting output and ending output.
5. The terminal of claim 3, wherein if there are at least two audio output modes, the processor is further configured to:
responding to the switching operation of the audio output mode, and determining the current audio output mode;
the audio playing component is specifically configured to output the mixed audio output stream through a determined audio output mode.
6. The terminal of claim 1, wherein when the audio stream is from an application, the audio object is an application different from the application or is a cellular phone; when the audio stream is from a cellular telephone, the audio object is an application.
7. An audio control method, comprising:
when an audio stream is acquired, inquiring the type and the application packet name of the audio stream from a first stream chain table or a second stream chain table based on the stream identification of the audio stream, wherein the first stream chain table is used for recording stream information of an audio input stream, the second stream chain table is used for recording stream information of an audio output stream, and the stream information in the first stream chain table and the second stream chain table is updated along with the audio playing state and the recording state in a terminal;
If the type of the audio stream is an audio output stream, searching whether an audio object taking the audio output stream of the application packet name of the audio stream as input exists in a temporary rule linked list;
if not, searching whether an audio object taking the audio output stream of the application packet name of the audio stream as input exists from a default rule linked list;
converting the audio stream into an audio input stream of the audio object when it is determined that there is an audio object taking the audio output stream of the application packet name of the audio stream as input;
sending the audio data processed by the audio object outwards;
wherein each rule in the temporary rule linked list and the default rule linked list comprises: a source for indicating the audio object from which an audio output stream comes, a target for indicating the audio object to which an audio input stream goes, a type for indicating whether the stream is an audio input stream or an audio output stream, and an application packet name.
8. A storage medium, characterized in that, when instructions in the storage medium are executed by a processor of a terminal, the terminal is enabled to perform the method of claim 7.
CN202111342130.XA 2021-11-12 2021-11-12 Terminal, audio control method and storage medium Active CN114416011B8 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111342130.XA CN114416011B8 (en) 2021-11-12 2021-11-12 Terminal, audio control method and storage medium

Publications (3)

Publication Number Publication Date
CN114416011A CN114416011A (en) 2022-04-29
CN114416011B true CN114416011B (en) 2024-03-15
CN114416011B8 CN114416011B8 (en) 2024-04-05

Family

ID=81264643

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111342130.XA Active CN114416011B8 (en) 2021-11-12 2021-11-12 Terminal, audio control method and storage medium

Country Status (1)

Country Link
CN (1) CN114416011B8 (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6650745B1 (en) * 1999-06-10 2003-11-18 Avaya Technologies Corp. Method and apparatus for dynamically exchanging data among participants to a conference call
KR20120015657A (en) * 2010-08-12 2012-02-22 주식회사 원캐스트 Premises broadcasting service apparatus for supporting audio mixing
CN102571758A (en) * 2011-12-16 2012-07-11 华为技术有限公司 Method and device for realizing seamless transfer of two-party call transfer conference
CN103379232A (en) * 2012-04-13 2013-10-30 展讯通信(上海)有限公司 Communication server, communication terminal and voice communication method
CN105185391A (en) * 2015-08-27 2015-12-23 三星电子(中国)研发中心 Method and device for multi-user sound box control
CN105323534A (en) * 2014-07-14 2016-02-10 深圳市潮流网络技术有限公司 Conference processing method of third party application and communication equipment
KR20160041233A (en) * 2014-10-07 2016-04-18 엘지전자 주식회사 Mobile terminal
CN109310525A (en) * 2016-06-14 2019-02-05 杜比实验室特许公司 Media compensation passes through and pattern switching
CN110032357A (en) * 2019-04-09 2019-07-19 青岛海信电器股份有限公司 The output method and display equipment of the audio data of application program
US10687155B1 (en) * 2019-08-14 2020-06-16 Mimi Hearing Technologies GmbH Systems and methods for providing personalized audio replay on a plurality of consumer devices

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150288735A1 (en) * 2014-04-04 2015-10-08 ClearOne Inc. Virtual Audio Device System for Unified Communications Applications
US20170131965A1 (en) * 2015-11-09 2017-05-11 Jarno Eerola Method, a system and a computer program for adapting media content
KR20200124948A (en) * 2019-04-25 2020-11-04 삼성전자주식회사 Electronic device and Method of controlling thereof


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China
Address after: 266071 Shandong city of Qingdao province Jiangxi City Road No. 11
Applicant after: Qingdao Hisense Mobile Communication Technology Co.,Ltd.
Address before: 266071 Shandong city of Qingdao province Jiangxi City Road No. 11
Applicant before: HISENSE MOBILE COMMUNICATIONS TECHNOLOGY Co.,Ltd.
Country or region before: China

Country or region after: China
Address after: 266100 No. 151, Zhuzhou Road, Laoshan District, Shandong, Qingdao
Applicant after: HISENSE MOBILE COMMUNICATIONS TECHNOLOGY Co.,Ltd.
Address before: 266071 Shandong city of Qingdao province Jiangxi City Road No. 11
Applicant before: HISENSE MOBILE COMMUNICATIONS TECHNOLOGY Co.,Ltd.
Country or region before: China

GR01 Patent grant
CI03 Correction of invention patent

Correction item: Applicant
Correct: Qingdao Hisense Mobile Communication Technology Co., Ltd.
False: Qingdao Hisense Mobile Communication Technology Co., Ltd.
Number: 10-02
Volume: 40

CI03 Correction of invention patent

Correction item: Patentee
Correct: Qingdao Hisense Mobile Communication Technology Co., Ltd.
False: Qingdao Hisense Mobile Communication Technology Co., Ltd.
Number: 11-02
Page: The title page
Volume: 40

OR01 Other related matters