WO2024103418A1 - Audio control method and device - Google Patents

Audio control method and device

Info

Publication number
WO2024103418A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
audio stream
duration
type
focus
Prior art date
Application number
PCT/CN2022/133026
Other languages
English (en)
Chinese (zh)
Inventor
全超
王雅莉
王键
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/CN2022/133026
Publication of WO2024103418A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/414Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones

Definitions

  • the present application relates to the field of audio processing technology, and in particular to an audio control method and device.
  • Audio focus is a virtual control point.
  • an application can play audio only after obtaining the audio focus.
  • when the audio focus is lost, the application's audio playback stops. That is, by controlling the occupation and release of the audio focus, it can be ensured that only one application holds the audio focus in the same period of time.
  • Audio focus includes two types: long focus and short focus.
  • the long focus type means that the audio will be played for a long time, and the audio focus will be occupied for a long time.
  • when the audio sink device uses audio from the audio source device, the audio sink device by default allocates a fixed type of audio focus (always long focus, or always short focus) to the audio from the audio source device, resulting in unreasonable allocation of the audio sink's audio service resources and affecting the user experience.
  • the present application provides an audio control method and device, which can reasonably allocate audio service resources and improve user experience.
  • the present application provides an audio control method, the method comprising: an audio sink device receives attribute information of a first audio stream from an audio source device; the audio sink device determines an audio focus type of the first audio stream based on the attribute information, the audio focus type being a long focus type or a short focus type.
  • the aforementioned attribute information indicates one or more of the following: a first duration of the first audio stream, a first type of the aforementioned first duration, or a priority of the aforementioned first audio stream.
  • the attribute information indicates a first type of a first duration of the first audio stream.
  • the method further includes: the audio sink device sends first information to the audio source device; the first information includes the duration of one or more audio streams stored in the audio sink device, and the duration of the one or more audio streams is used to determine the first type.
  • the first information may be carried in a local audio stream attribute read response message sent by the audio sink device to the audio source device.
  • the audio sink device can send the local audio stream environment information of the audio sink device (including the duration of the above one or more audio streams) to the audio source device.
  • the audio source device can determine a more reasonable duration type for the audio stream based on the local audio stream environment information of the audio sink device.
  • the audio sink device can reasonably determine the audio focus type of the audio stream based on the type of duration of the audio stream.
  • the determined audio focus type is matched with the audio duration of the audio stream, thereby achieving reasonable allocation of audio service resources (such as playback resources) and improving user experience.
  • the duration of the one or more audio streams is stored in an audio focus management stack of the audio sink device.
  • the audio sink device can store the duration of the local audio stream in the audio focus management stack for quick retrieval.
  • the aforementioned attribute information indicates a first type of the first duration of the aforementioned first audio stream; when the first type is a short audio duration type, the audio focus type of the first audio stream is a short focus type; or, when the first type is a long audio duration type, the audio focus type of the first audio stream is a long focus type.
  • the audio sink device can reasonably determine the audio focus type of the audio stream based on the type of duration of the audio stream, so that the determined audio focus type matches the audio duration of the audio stream, thereby achieving reasonable allocation of audio service resources (such as playback resources) and improving user experience.
  • the aforementioned attribute information indicates a first duration of the aforementioned first audio stream; when the aforementioned first duration is less than a first threshold, the audio focus type of the first audio stream is a short focus type; or, when the aforementioned first duration is greater than the aforementioned first threshold, the audio focus type of the first audio stream is a long focus type.
  • the audio sink device can reasonably determine the audio focus type of the audio stream based on the duration of the audio stream, so that the determined audio focus type matches the audio duration of the audio stream, thereby achieving reasonable allocation of audio service resources (such as playback resources) and improving user experience.
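The threshold comparison described above can be sketched in a few lines. The threshold value and the focus-type labels below are illustrative assumptions, not values fixed by the application:

```python
# Illustrative sketch of the duration-threshold rule; FIRST_THRESHOLD_MS
# and the returned labels are assumptions, not values from the application.
FIRST_THRESHOLD_MS = 30_000  # assumed cutoff between short and long audio

def focus_type_for_duration(first_duration_ms: int) -> str:
    """Map a stream's first duration to an audio focus type."""
    if first_duration_ms < FIRST_THRESHOLD_MS:
        return "short_focus"
    return "long_focus"
```

For example, under this assumed threshold a 5-second notification sound would receive short focus, while a 3-minute song would receive long focus.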
  • the audio sink device receives the attribute information of the first audio stream from the audio source device, including: the audio sink device receives the attribute information of the first audio stream from the audio source device during an audio stream configuration or reconfiguration phase; or the audio sink device receives the attribute information of the first audio stream from the audio source device during an audio stream transmission phase.
  • an audio occupancy field can be added to the audio stream configuration message or the audio stream transmission message to transmit the attribute information of the audio stream, so that there is no need to send another message to transmit the attribute information of the audio stream, thereby saving transmission resources and improving communication efficiency.
  • the method further includes: the audio sink device determining, based on the attribute information, a storage space allocated to the first audio stream and/or a processing priority of the first audio stream.
  • the storage space of the audio stream is determined based on the attribute information of the audio stream, so that storage resources can be reasonably allocated and the waste of storage resources can be reduced.
  • the processing priority of the audio stream is determined based on the attribute information of the audio stream, so that processing resources can be reasonably allocated while processing the audio stream in a timely manner, thereby improving the processing efficiency of the system.
  • the audio sink device determines the storage space allocated for the first audio stream based on the attribute information, including: when the attribute information indicates that a first duration of the first audio stream is greater than a second threshold, the audio sink device allocates a first storage space for the first audio stream; the duration of the audio stream that can be stored in the first storage space is less than the first duration, and the first storage space is used to cyclically store the not-yet-played portion of the first audio stream.
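A minimal sketch of this allocation rule, assuming a fixed frame duration and a deque-backed ring buffer. All sizes, names, and the frame granularity are illustrative rather than taken from the application:

```python
from collections import deque

FRAME_MS = 20  # assumed duration of one buffered audio frame

def allocate_buffer(first_duration_ms: int, second_threshold_ms: int,
                    buffer_duration_ms: int) -> deque:
    """Allocate storage for an incoming audio stream.

    A stream longer than the threshold gets a bounded 'first storage
    space' that holds less audio than the whole stream and is reused
    cyclically for the not-yet-played portion; a short stream is
    buffered in full.
    """
    if first_duration_ms > second_threshold_ms:
        # First storage space: strictly smaller than the whole stream.
        assert buffer_duration_ms < first_duration_ms
        return deque(maxlen=buffer_duration_ms // FRAME_MS)
    return deque(maxlen=first_duration_ms // FRAME_MS)
```

The bounded buffer is what lets storage stay small for long streams: frames already played are discarded and their space is reused for frames still in transit.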
  • the present application provides an audio control method, the method comprising: an audio sink device allocates an audio focus to a first audio stream as a short focus; the audio sink device determines that the length of time for which the first audio stream occupies the audio focus is greater than a threshold; and the audio sink device changes the audio focus type of the first audio stream from a short focus to a long focus.
  • before the audio sink device allocates the audio focus to the first audio stream as a short focus, the method further includes: the audio sink device receives an audio stream transmission request for the first audio stream from the audio source device; the audio stream transmission request does not indicate attribute information of the first audio stream.
  • the method further includes: the audio sink device sends an audio stream transmission change event of the first audio stream to the audio source device.
  • the audio sink device dynamically applies for a corresponding type of audio focus for the audio stream based on the duration for which the audio stream occupies the audio focus, so that the type of audio focus applied for matches the audio duration of the audio stream, thereby achieving reasonable allocation of audio service resources (such as playback resources) and improving user experience.
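The short-to-long upgrade in this aspect could be monitored as follows. The polling approach, the default threshold, and all names are assumptions for illustration; the notification back to the source is indicated only as a comment:

```python
import time

class FocusMonitor:
    """Grants focus as short focus by default and upgrades it to long
    focus once the stream has occupied the focus longer than a threshold.
    A sketch: a real device would also notify the audio source with an
    audio stream transmission change event at the upgrade point."""

    def __init__(self, threshold_s: float = 30.0):
        self.threshold_s = threshold_s
        self.focus_type = "short_focus"   # default when no attribute info arrived
        self._granted_at = time.monotonic()

    def poll(self) -> str:
        held_s = time.monotonic() - self._granted_at
        if self.focus_type == "short_focus" and held_s > self.threshold_s:
            self.focus_type = "long_focus"  # change the focus type in place
        return self.focus_type
```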
  • the present application provides an audio control method, the method comprising: an audio source device obtains attribute information of a first audio stream; the audio source device sends the attribute information to an audio sink device; and the attribute information is used to determine an audio focus type of the first audio stream.
  • the aforementioned attribute information indicates one or more of the following: a first duration of the first audio stream, a first type of the aforementioned first duration, or a priority of the aforementioned first audio stream.
  • the audio source device can send audio stream attribute information to the audio sink device.
  • the audio sink device can determine the audio focus type of the audio stream based on the audio stream attribute information. The determined audio focus type is matched with the audio duration of the audio stream, thereby achieving reasonable allocation of audio service resources (such as playback resources) and improving user experience.
  • the attribute information indicates a first type of a first duration of the first audio stream; before the audio source device obtains the attribute information of the first audio stream, the method further includes:
  • the audio source device receives first information from the audio sink device; the first information includes the duration of one or more audio streams stored in the audio sink device;
  • the audio source device determines a second duration based on the duration of the one or more audio streams.
  • the audio source device determines the first type based on the second duration and the first duration.
  • when the aforementioned first duration is less than the aforementioned second duration, the aforementioned first type is a short audio duration type; or, when the aforementioned first duration is greater than the aforementioned second duration, the aforementioned first type is a long audio duration type.
  • the audio source device can receive local audio stream environment information (including the duration of the one or more audio streams mentioned above) from the audio sink device. This allows the audio source device to determine a more reasonable duration type for the audio stream based on the local audio stream environment information of the audio sink device. Furthermore, after the audio stream is transmitted to the audio sink device, the audio sink device can reasonably determine the audio focus type of the audio stream based on the type of duration of the audio stream. The determined audio focus type matches the audio duration of the audio stream, thereby achieving reasonable allocation of audio service resources (such as playback resources) and improving user experience.
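One way to realize this exchange: the source derives a reference "second duration" from the durations the sink reported, then classifies the new stream against it. Using the mean as the reference is an assumption here; the application does not fix how the second duration is derived:

```python
def duration_type(first_duration_ms: int, sink_durations_ms: list) -> str:
    """Classify a stream's duration against the sink's local audio
    environment (durations of streams already stored at the sink)."""
    # Assumed derivation: the second duration is the mean of the
    # durations reported in the sink's first information.
    second_duration_ms = sum(sink_durations_ms) / len(sink_durations_ms)
    if first_duration_ms < second_duration_ms:
        return "short_audio_duration"
    return "long_audio_duration"
```

For example, against a sink that mostly holds multi-minute music streams, a short prompt tone would be classified as a short audio duration type and later be granted short focus.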
  • before the audio source device obtains the attribute information of the first audio stream, the method further includes: the audio source device determines the attribute information of the first audio stream based on associated information of the application to which the first audio stream belongs; the associated information of the application includes the audio playlist, audio play mode, or historical play data of the application.
  • the audio source device can estimate the duration of the audio stream based on the associated information of the application to which the audio stream belongs, through historical data statistics or artificial intelligence (AI) analysis, etc. This allows the subsequent audio sink device to more reasonably determine the audio focus type of the audio stream based on the duration of the audio stream.
  • the method further includes: the audio source device determines the storage space allocated to the first audio stream and/or the processing priority of the first audio stream based on the attribute information.
  • the storage space of the audio stream is determined based on the attribute information of the audio stream, so that storage resources can be reasonably allocated and the waste of storage resources can be reduced.
  • the processing priority of the audio stream is determined based on the attribute information of the audio stream, so that processing resources can be reasonably allocated while processing the audio stream in a timely manner, thereby improving the processing efficiency of the system.
  • the present application provides an audio sink device, the audio sink device comprising:
  • a receiving unit configured to receive attribute information of a first audio stream from an audio source device
  • a processing unit is used to determine the audio focus type of the first audio stream based on the attribute information, where the audio focus type is a long focus type or a short focus type.
  • the attribute information indicates one or more of the following: a first duration of the first audio stream, a first type of the first duration, or a priority of the first audio stream.
  • the attribute information indicates a first type of a first duration of the first audio stream
  • the audio sink device further includes a sending unit, configured to send first information to the audio source device before the receiving unit receives the attribute information of the first audio stream from the audio source device; the first information includes the duration of one or more audio streams stored in the audio sink device, and the duration of the one or more audio streams is used to determine the first type.
  • the sending unit is specifically configured to send a local audio stream attribute reading response message to the audio source device, wherein the local audio stream attribute reading response message includes the first information.
  • the duration of the one or more audio streams is stored in an audio focus management stack of the audio sink device.
  • the aforementioned attribute information indicates a first type of a first duration of the aforementioned first audio stream; when the aforementioned first type is a short audio duration type, the audio focus type of the aforementioned first audio stream is a short focus type; or, when the aforementioned first type is a long audio duration type, the audio focus type of the aforementioned first audio stream is a long focus type.
  • the aforementioned attribute information indicates a first duration of the aforementioned first audio stream; when the aforementioned first duration is less than a first threshold, the audio focus type of the first audio stream is a short focus type; or, when the aforementioned first duration is greater than the aforementioned first threshold, the audio focus type of the first audio stream is a long focus type.
  • the aforementioned receiving unit is specifically used for:
  • an audio stream transmission message is received from the audio source device, wherein the audio occupancy field of the audio stream transmission message includes attribute information of the first audio stream.
  • the processing unit is further configured to determine a storage space allocated to the first audio stream and/or a processing priority of the first audio stream based on the attribute information.
  • the processing unit is specifically used to: allocate a first storage space for the first audio stream when the attribute information indicates that the first duration of the first audio stream is greater than a second threshold; the duration of the audio stream that can be stored in the first storage space is less than the first duration, and the first storage space is used to cyclically store unused audio streams in the first audio stream.
  • the present application provides an audio sink device, the audio sink device comprising:
  • a processing unit is used to allocate the audio focus in the form of short focus to the first audio stream based on the audio stream transmission request; determine that the time length of the first audio stream occupying the audio focus is greater than a threshold; and change the audio focus type of the first audio stream from short focus to long focus.
  • the audio sink device further includes a receiving unit, configured to receive an audio stream transmission request for the first audio stream from the audio source device before the processing unit allocates the audio focus to the first audio stream as a short focus; the audio stream transmission request does not indicate attribute information of the first audio stream.
  • the audio sink device further includes a sending unit configured to send an audio stream transmission change event of the first audio stream to the audio source device after the processing unit assigns the audio focus to the first audio stream in the form of a short focus.
  • the present application provides an audio source device, the audio source device comprising:
  • a processing unit configured to obtain attribute information of a first audio stream
  • a sending unit configured to send the attribute information to the audio sink device; the attribute information is used to determine the audio focus type of the first audio stream.
  • the attribute information indicates one or more of the following: a first duration of the first audio stream, a first type of the first duration, or a priority of the first audio stream.
  • the attribute information indicates a first type of a first duration of the first audio stream
  • the audio source device further includes a receiving unit, configured to receive first information from the audio sink device before the processing unit obtains the attribute information of the first audio stream; the first information includes the duration of one or more audio streams stored in the audio sink device;
  • the processing unit is further configured to determine a second duration based on the duration of the one or more audio streams; and determine the first type based on the second duration and the first duration.
  • when the aforementioned first duration is less than the aforementioned second duration, the aforementioned first type is a short audio duration type; or, when the aforementioned first duration is greater than the aforementioned second duration, the aforementioned first type is a long audio duration type.
  • before acquiring the attribute information of the first audio stream, the processing unit is further configured to determine the attribute information of the first audio stream based on associated information of the application to which the first audio stream belongs; the associated information of the application includes the audio playlist, audio play mode, or historical play data of the application.
  • the processing unit is further configured to determine, after acquiring the attribute information of the first audio stream, a storage space allocated to the first audio stream and/or a processing priority of the first audio stream based on the attribute information.
  • the present application provides an audio sink device, which includes a processor and a memory.
  • the memory is coupled to the processor, and when the processor executes a computer program or computer instruction stored in the memory, the method described in any one of the first aspects above can be implemented.
  • the audio sink device may also include a communication interface, which is used for the audio sink device to communicate with other devices (such as an audio source device).
  • the communication interface may be a transceiver, circuit, bus, module, or other type of communication interface.
  • the audio sink device may include:
  • Memory for storing computer programs or computer instructions
  • the processor is used to: receive attribute information of a first audio stream from an audio source device through a communication interface; and determine an audio focus type of the first audio stream based on the attribute information, wherein the audio focus type is a long focus type or a short focus type.
  • the processor executes a computer program or a computer instruction stored in the memory, the method described in any one of the second aspects may be implemented.
  • the processor is configured to:
  • allocate the audio focus to the first audio stream as a short focus; determine that the duration for which the first audio stream occupies the audio focus is greater than a threshold; and change the audio focus type of the first audio stream from short focus to long focus.
  • the computer program or computer instructions in the memory may be stored in advance or downloaded from the Internet when the audio sink device is used; the present application does not specifically limit the source of the computer program or computer instructions in the memory.
  • the coupling in the embodiment of the present application is an indirect coupling or connection between units or modules, which can be electrical, mechanical or other forms, and is used for information exchange between units or modules.
  • the present application provides an audio source device, the audio source device comprising a processor and a memory.
  • the memory is coupled to the processor, and when the processor executes a computer program or computer instruction stored in the memory, the method described in any one of the third aspects can be implemented.
  • the audio source device may also include a communication interface, which is used for the audio source device to communicate with other devices (such as an audio sink device); illustratively, the communication interface may be a transceiver, circuit, bus, module, or other type of communication interface.
  • the audio source device may include:
  • Memory for storing computer programs or computer instructions
  • the processor is used to: obtain attribute information of the first audio stream; and send the attribute information to the audio sink device through the communication interface; the attribute information is used to determine the audio focus type of the first audio stream.
  • the computer program or computer instruction in the memory in the present application can be pre-stored or downloaded from the Internet and stored when the audio source device is used.
  • the present application does not specifically limit the source of the computer program or computer instruction in the memory.
  • the coupling in the embodiment of the present application is an indirect coupling or connection between units or modules, which can be electrical, mechanical or other forms, and is used for information exchange between units or modules.
  • an embodiment of the present application provides an audio communication system, the system comprising an audio sink device and an audio source device; wherein the audio sink device is used to execute the method described in any one of the first or second aspects, and the audio source device is used to execute the method described in any one of the third aspects.
  • an embodiment of the present application provides a computer-readable storage medium, which stores a computer program or computer instructions, and the aforementioned computer program or computer instructions are executed by a processor to implement any method described in the first aspect above.
  • an embodiment of the present application provides a computer-readable storage medium, which stores a computer program or computer instructions, and the aforementioned computer program or computer instructions are executed by a processor to implement any method described in the second aspect above.
  • an embodiment of the present application provides a computer-readable storage medium, which stores a computer program or computer instructions, and the aforementioned computer program or computer instructions are executed by a processor to implement the method described in any one of the above-mentioned third aspects.
  • an embodiment of the present application provides a computer program product.
  • the computer program product is executed by a processor, the method described in any one of the above-mentioned first aspects will be implemented.
  • an embodiment of the present application provides a computer program product.
  • the computer program product is executed by a processor, the method described in any one of the above-mentioned second aspects will be implemented.
  • an embodiment of the present application provides a computer program product.
  • the computer program product is executed by a processor, the method described in any one of the above third aspects will be implemented.
  • Figures 1 and 2 are schematic diagrams of an audio stream applying for audio focus
  • FIG. 3 is a schematic diagram of the system architecture
  • FIG. 4 is a schematic diagram of the interaction process between the audio source device and the audio sink device
  • FIG. 5 to FIG. 7A are schematic flowcharts of audio control methods provided in embodiments of the present application.
  • FIG. 8 to FIG. 12 are schematic diagrams of the structures of devices provided in embodiments of the present application.
  • multiple refers to two or more.
  • "and/or" describes an association relationship between associated objects and indicates three possible relationships; for example, A and/or B can mean: A exists alone, B exists alone, or A and B exist simultaneously.
  • description methods such as "at least one (or at least one item) of a1, a2, ... and an" used in the embodiments of the present application include the situation in which any one of a1, a2, ... and an exists alone, and also include any combination of any multiple of a1, a2, ... and an.
  • the description method of "at least one of a, b and c" includes the situation that a is alone, b is alone, c is alone, a and b are combined, a and c are combined, b and c are combined, or a, b, c are combined.
  • Audio source: a device that generates and/or sends audio streams.
  • Audio sink: a device that receives and uses (including plays) audio streams.
  • Audio focus: a virtual control point introduced in the audio sink to avoid the mixing caused by playing two audio streams at the same time. An audio stream must obtain the audio focus before it can be played; when the audio focus is lost, the audio stream must stop playing.
  • Long focus means that the audio will be played for a long time, and the audio focus will be occupied for a long time.
  • when a new audio stream applies for long focus, the audio stream currently holding the audio focus permanently loses it; even after the new audio stream releases the focus, the audio stream that lost the focus cannot regain it.
  • Short focus type means that the audio will only be played briefly, and the audio focus will be released quickly.
  • when a new audio stream applies for short focus, the audio stream currently holding the audio focus temporarily loses it; after the new audio stream releases the focus, the audio stream that lost the focus regains it.
  • Audio focus management stack: a data structure used to manage audio focus in the audio sink system. Each element in the stack represents an application and the resources it occupies, and the top element of the stack is the application holding the audio focus. When a new application applies for long focus, all existing elements are removed from the stack, the new application is pushed onto the stack, and the other applications permanently lose the audio focus, as shown in Figure 1. When a new application applies for short focus, the new application is pushed onto the stack as the top element and holds the audio focus; when it releases the audio focus, it is popped off the stack and the audio focus is transferred to the new top element of the stack, as shown in Figure 2.
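The stack behavior described above can be sketched as follows; the class and method names are illustrative, not taken from the application:

```python
class AudioFocusStack:
    """Audio focus management stack: the top element holds the focus."""

    def __init__(self):
        self._stack = []  # each element represents one application

    def request(self, app: str, focus_type: str) -> None:
        if focus_type == "long_focus":
            self._stack.clear()   # existing holders permanently lose focus (Fig. 1)
        self._stack.append(app)   # the new applicant becomes top of stack

    def release(self, app: str) -> None:
        if self._stack and self._stack[-1] == app:
            self._stack.pop()     # focus moves to the new top element (Fig. 2)

    def holder(self):
        """Application currently holding the audio focus, if any."""
        return self._stack[-1] if self._stack else None
```

A short focus request (e.g. a navigation prompt over music) leaves the previous holder in the stack so it regains the focus on release, while a long focus request clears the stack entirely.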
  • the above application can be any application that can have audio playback requirements, such as various music applications, audiobook applications, reading applications or telephone applications, etc.
  • the application can be, for example, an application that implements wireless or wired communication between the audio sink device and other devices (such as the audio source device).
  • the application can be, for example, a Spark Link communication application on the audio sink device, a long term evolution (LTE) network communication application, a fifth-generation mobile communication technology (5G) communication application, a wireless local area network (for example, Wi-Fi) communication application, a Bluetooth (BT) communication application, a Zigbee communication application, or a vehicle-mounted short-range wireless communication application, etc.
  • the application can be an application that can be downloaded and installed from an application market.
  • the application can apply for an audio focus for the audio stream waiting to be played.
  • the audio focus can be represented by other names, and all objects that can implement the function corresponding to the above audio focus belong to the audio focus described in the embodiment of the present application, that is, the embodiment of the present application does not limit the name of the object that implements the function.
  • the embodiments of the present application are described below using the name "audio focus" as an example.
  • FIG. 3 shows a schematic diagram of the system architecture of an embodiment of the present application.
  • the system architecture includes a sound source device 310 and a sound sink device 320.
  • Wired communication or wireless communication can be achieved between the sound source device 310 and the sound sink device 320.
  • the wireless communication between the sound source device 310 and the sound sink device 320 can be achieved through wireless communication technologies such as Spark Link technology, long term evolution (LTE) network technology, fifth generation mobile communication technology (5G), wireless local area network (e.g., Wi-Fi) technology, Bluetooth (BT) technology, Zigbee technology or vehicle-mounted short-range wireless communication network technology.
  • the audio source device 310 may send an audio stream to the audio sink device 320, and the audio sink device 320 may receive and use (eg, play, etc.) the audio stream.
  • the above-mentioned sound source device 310 may include, but is not limited to, any electronic product based on an intelligent operating system, which can interact with a user through input devices such as a keyboard, a virtual keyboard, a touchpad, a touch screen, and a voice control device.
  • an intelligent operating system such as a smart phone, a tablet personal computer (Tablet PC), a handheld computer, a wearable electronic device, a personal computer (personal computer, PC), a television or a car device, etc.
  • the intelligent operating system includes, but is not limited to, any operating system that enriches the functions of a device by providing various applications to the device, such as Android, IOS, Windows, MAC or Harmony OS.
  • the above-mentioned audio host device 320 may include, but is not limited to, any electronic product based on an intelligent operating system, which can interact with a user through input devices such as a keyboard, a virtual keyboard, a touch pad, a touch screen, and a voice control device, such as a smart phone, a tablet computer (tablet personal computer, Tablet PC), a handheld computer, a wearable electronic device, a personal computer (personal computer, PC), a television, a car device, a headset or a speaker, etc.
  • FIG. 4 shows a schematic diagram of the flow of interaction between the sound source device 310 and the sound sink device 320.
  • the interaction between the sound source device 310 and the sound sink device 320 may include but is not limited to service discovery, audio attribute acquisition, audio stream configuration or reconfiguration, audio stream transmission channel opening or closing, audio stream transmission or stopping, and audio stream release.
  • a communication connection has been established between the sound source device 310 and the sound sink device 320 before the interaction process shown in FIG. 4 is implemented.
  • the communication connection can be, for example, a wireless communication connection.
  • the specific implementation of the interaction process shown in FIG. 4 is exemplarily described below.
  • the above-mentioned sound source device 310 and the sound sink device 320 may first perform a service discovery operation.
  • the main purpose of the service discovery operation is to allow the sound source device 310 to discover the audio services (for example, including audio stream management services and audio attribute disclosure services) that the sound sink device 320 can provide.
  • the sound source device 310 can obtain the structural member information of the audio stream management service and the audio attribute disclosure service in the sound sink device 320 to learn about the services that the sound sink device 320 can provide.
  • the structural member information can be understood as a kind of description information used to represent these services.
  • Audio attribute acquisition After the audio source device 310 acquires the structural member information of the audio attribute public service in the audio sink device 320, it can further acquire specific information in the audio attribute public service from the audio sink device 320 to learn the audio processing capability (such as audio codec capability, etc.) of the audio sink device 320.
  • Audio stream configuration or reconfiguration After the audio source device 310 learns the audio processing capability of the audio sink device 320, it can configure the audio stream parameters within the capability range given by the audio sink device 320. For example, assuming that the audio sink device 320 can support multiple encoders, the audio source device 310 can select one of the multiple encoders as the encoder of the audio stream transmitted between the audio source device 310 and the audio sink device 320. Optionally, the audio stream parameters between the audio source device 310 and the audio sink device 320 can be reconfigured to determine more reasonable audio stream parameters.
  • Opening or closing the audio stream transmission channel After the audio stream parameters are configured between the audio source device 310 and the audio sink device 320, the audio source device 310 and the audio sink device 320 can interactively negotiate to establish a transmission channel for transmitting the audio stream. In addition, after the audio stream transmission is completed, the audio stream transmission channel between the audio source device 310 and the audio sink device 320 can be closed.
  • Audio stream transmission or stop After the audio stream transmission channel is established, the audio source device 310 can transmit the audio stream to the audio sink device 320. After the audio source device 310 completes the transmission of the audio stream to the audio sink device 320, the transmission of the audio stream can be stopped. Alternatively, the connection between the audio source device 310 and the audio sink device 320 may be suddenly interrupted, resulting in the cessation of the transmission of the audio stream.
  • Audio stream release After the above audio stream transmission is completed or stopped, the audio source device 310 and the audio sink device 320 can release the resources occupied by the audio stream (such as link resources at the access layer) and clear the configured related parameters.
  • Audio focus includes two types: long focus and short focus. Long focus is suitable for audio streams with longer audio durations, while short focus is suitable for audio streams with shorter audio durations.
  • By default, however, the system of the audio host device applies for a single type of audio focus, so audio service resources cannot be allocated reasonably, which affects the user experience.
  • an embodiment of the present application provides an audio control method.
  • FIG5 shows a flowchart of an audio control method provided by an embodiment of the present application.
  • the method includes but is not limited to the following steps:
  • the audio source device sends attribute information of a first audio stream to the audio sink device.
  • the sound source device may be, for example, the sound source device 310 shown in Fig. 3.
  • the sound sink device may be, for example, the sound sink device 320 shown in Fig. 3.
  • the attribute information of the first audio stream may be carried in an audio stream transmission request and sent to the audio sink device. That is, the audio source device sends an audio stream transmission request to the audio sink device, and the audio stream transmission request includes the attribute information.
  • the above-mentioned audio stream transmission request may be a message sent by the audio source device to the audio sink device after the audio stream transmission channel between the audio source device and the audio sink device is established, requesting the transmission of the audio stream. The message is used to inform the audio sink device that the audio source device is ready to send an audio stream to the audio sink device, so that the audio sink device can be prepared to receive the audio stream.
  • the preparation may, for example, be applying for an audio focus and storage space (for example, a buffer space) for the received audio stream.
  • the establishment of the audio stream transmission channel can be exemplified by referring to the description of the audio stream transmission channel opening step in Fig. 4.
  • the audio stream transmission request can be, for example, a message sent in the audio stream transmission phase in Fig. 4.
  • the attribute information of the first audio stream may be carried in an audio stream configuration request and sent to the audio sink device.
  • the attribute information of the first audio stream may be carried in an audio stream configuration response message and sent to the audio sink device.
  • the audio stream configuration response message may be a response message sent by the audio source device to the audio sink device based on the audio stream configuration message sent by the audio sink device. The following description is taken as an example that the attribute information of the first audio stream is carried in the audio stream configuration request. That is, the audio source device sends an audio stream configuration request to the audio sink device, and the audio stream configuration request includes the attribute information.
  • the above-mentioned audio stream configuration request may be a message sent by the audio source device to the audio sink device to request the implementation of the audio stream configuration after obtaining specific information in the audio attribute disclosure service of the audio sink device.
  • the message is used to inform the audio sink device to determine the configured audio stream parameters (such as encoding and decoding parameters, transmission mode parameters or transparent transmission mode parameters, etc.).
  • the specific information in the above-mentioned audio attribute disclosure service of the audio sink device can be exemplified by referring to the relevant description of the audio attribute acquisition step in Figure 4.
  • the above-mentioned audio stream configuration request can be, for example, a message sent in the audio stream configuration or reconfiguration phase in Figure 4.
  • the attribute information of the first audio stream may include one or more of the following: a first duration of the first audio stream, a first type of the first duration, or a priority of the first audio stream.
  • the duration of the audio stream may be represented by the value obtained by multiplying the inverse of the sampling rate of the audio stream (i.e., the sampling period) by the number of sampling points of the audio stream.
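The computation described above amounts to a one-line helper (a sketch; the function name is illustrative):

```python
def audio_stream_duration(sample_rate_hz: float, num_samples: int) -> float:
    """Duration in seconds: the sampling period (the inverse of the
    sampling rate) multiplied by the number of sampling points."""
    # (1 / sample_rate_hz) * num_samples, written as a single division
    return num_samples / sample_rate_hz
```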
  • the types of duration of an audio stream may include a short audio duration type and a long audio duration type. If the duration of an audio stream is less than or equal to a certain threshold (referred to as threshold 1), it belongs to the short audio duration type. If the duration of an audio stream is greater than or equal to a certain threshold (referred to as threshold 2), it belongs to the long audio duration type.
  • the threshold 1 and the threshold 2 may be equal or unequal. If they are unequal, the threshold 2 is greater than the threshold 1.
  • the threshold 1 and the threshold 2 may be set according to actual implementation, and the embodiments of the present application do not limit this.
  • the priority of an audio stream may include two types: high priority and low priority.
  • the priority of the audio stream may be determined based on the application to which the audio stream belongs. For example, if the audio stream comes from an application with a high probability of playing audio for a long time, such as a phone application, a music player application, or a book application, then the priority of the audio stream is a high priority. On the contrary, if the audio stream comes from an application with a high probability of playing audio (message prompt tone) for a short time, such as a text message application or a message notification application, then the priority of the audio stream is a low priority.
  • the priority of the audio stream may be determined based on the duration of the audio stream, and its specific implementation may refer to the determination of the type of duration of the above-mentioned audio stream. Then, if the duration of the audio stream is less than a certain threshold, the priority of the audio stream is a low priority. If the duration of the audio stream is greater than a certain threshold, the priority of the audio stream is a high priority. If the duration of the audio stream is equal to the certain threshold, the priority of the audio stream may be a low priority or a high priority. It is to be understood that this is only an example and does not constitute a limitation to the embodiments of the present application.
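The two duration-based classifications above can be sketched as follows (the tie-break at exact equality is an illustrative choice; the text allows either outcome there):

```python
def duration_type(duration: float, threshold1: float, threshold2: float):
    """Classify an audio stream's duration.

    threshold2 is greater than or equal to threshold1; when they are
    unequal, durations strictly between them match neither type and
    None is returned.
    """
    if duration <= threshold1:
        return "short"
    if duration >= threshold2:
        return "long"
    return None

def stream_priority(duration: float, threshold: float) -> str:
    """Low priority below the threshold, high priority above it; at the
    exact threshold either is allowed (high is an arbitrary choice here)."""
    return "low" if duration < threshold else "high"
```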
  • the attribute information of the first audio stream includes the first duration.
  • in some cases, the first duration cannot be calculated from the sampling rate and the number of sampling points of the first audio stream; for example, the duration of a real-time audio stream that depends on user operations, such as music or a call, is uncertain.
  • the sound source device can determine the duration of the first audio stream based on the associated information of the application to which the first audio stream belongs.
  • the associated information of the application may include an audio playlist, an audio play mode, or historical play data of the application. For example, it is assumed that the application to which the first audio stream belongs is a music application or a telephone application, etc.
  • then the duration of the first audio stream can be estimated based on the associated information of the application by means of historical data statistics or artificial intelligence (AI) analysis.
  • the attribute information of the first audio stream includes the first type.
  • the first type is determined based on the duration of one or more audio streams stored in the audio sink device.
  • the audio sink device may send the duration of the one or more audio streams to the audio source device.
  • the audio source device may first send a request message to the audio sink device to obtain the local audio stream attributes of the audio sink device (see S501A).
  • the request message may be a read command (SSAP_READ_REQ (local audio stream attributes)) message for the local audio stream attributes.
  • the audio sink device may obtain information about one or more locally stored audio streams based on the request message.
  • the information about the one or more audio streams includes the duration of the one or more audio streams.
  • the information about the one or more audio streams may also include information such as the number of the one or more audio streams and the type of the one or more audio streams.
  • the information of the one or more audio streams belongs to the attributes of the local audio stream environment in the audio sink device.
  • the attributes of the local audio stream environment may be designed and stored in the audio stream management service in the audio sink device. For example, see Table 1, which exemplarily lists the attributes of the local audio stream environment.
  • the local audio stream environment attribute of the audio host device can be obtained by the audio source device by reading it or by receiving a notification.
  • the number of audio streams included in the local audio stream environment attribute can occupy 1 byte, the audio stream type occupies 2 bytes, and the duration of the audio stream occupies 3 bytes.
  • the local audio stream environment attribute can support one or more of the following permissions: authentication authority, encryption authority, or authorization authority.
  • the authentication permission can be set to authentication required or authentication not required, the encryption permission can be set to encryption required or encryption not required, and the authorization permission can be set to authorization required or authorization not required.
  • the specific definition of the one or more permissions can be specified by the audio use case standard, and the embodiments of the present application are not limited.
  • the above-mentioned “required” option indicates that the local audio stream environment attribute is an attribute that must exist in the audio stream management service of the audio host device. It can be understood that the local audio stream environment attributes of the audio host device shown in the above Table 1 are only examples and do not constitute limitations on the embodiments of the present application.
  • the duration of the one or more audio streams can be stored in the audio focus management stack of the audio host device.
  • the one or more audio streams have their own applications.
  • the application to which the audio stream belongs can add the duration parameter of the audio stream in the audio focus management stack to the audio focus management stack through the audio focus application interface.
  • each element in the stack represents an application and the resources it occupies. Then, the duration parameter of an audio stream can be saved in the data structure of the corresponding element of the application to which the audio stream belongs in the audio focus management stack.
  • the above-mentioned audio host device can obtain the duration of the one or more audio streams from the audio focus management stack based on the request message of the above-mentioned local audio stream attributes.
  • the number of the one or more audio streams and the type of the one or more audio streams can also be obtained from the audio focus management stack.
  • the local audio stream attribute information can be sent to the above-mentioned audio source device (see S501B in Figure 6).
  • the audio sink device can carry the local audio stream attribute information through a local audio stream attribute read response message (SSAP_READ_RSP message) to send it to the audio source device.
  • steps S501A and S501B may be interactive steps in the audio attribute acquisition process shown in FIG. 4 .
  • the audio source device After receiving the message from the audio sink device, the audio source device obtains the duration of one or more audio streams stored locally in the audio sink device. Then, the audio source device may determine the first type based on the duration of the one or more audio streams. Exemplarily, the audio source device may calculate a threshold (referred to as the second duration) for determining the first type based on the duration of the one or more audio streams. Exemplarily, the second duration may be, for example, an average value or a weighted average value of the durations of the one or more audio streams, or may be a maximum value among the durations of the one or more audio streams, etc. After obtaining the second duration, the audio source device may compare the first duration of the first audio stream with the second duration.
  • If the first duration is less than the second duration, then the first type is a short audio duration type. If the first duration is greater than the second duration, then the first type is a long audio duration type. If the first duration is equal to the second duration, then the first type may be either a short audio duration type or a long audio duration type.
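The source-side derivation of the second duration and of the first type can be sketched as follows (the averaging mode and the tie-break at equality are assumptions; the text allows an average, a weighted average, or a maximum):

```python
def second_duration(durations: list, mode: str = "average") -> float:
    """Threshold derived from the durations of the audio streams
    stored locally in the audio sink device."""
    if mode == "average":
        return sum(durations) / len(durations)
    if mode == "max":
        return max(durations)
    raise ValueError(f"unknown mode: {mode}")

def determine_first_type(first_duration: float, durations: list) -> str:
    """Compare the first duration with the second duration; at exact
    equality either type may be chosen (short is picked here)."""
    threshold = second_duration(durations)
    return "long" if first_duration > threshold else "short"
```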
  • the attribute information of the first audio stream may include the priority of the first audio stream. If the priority is determined based on the duration of the audio stream, the determination of the priority of the first audio stream can refer to the determination of the first type. Exemplarily, if the first duration of the first audio stream is less than the second duration, the priority of the first audio stream is a low priority. If the first duration is greater than the second duration, the priority is a high priority. If the first duration is equal to the second duration, the priority can be either a low priority or a high priority.
  • In this way, the sound source device determines the type or priority of the duration of the audio stream to be transmitted to the audio sink device based on the local audio stream attribute information of the audio sink device, so that the audio sink device can more reasonably determine the audio focus of the audio stream based on that type or priority, thereby achieving a reasonable allocation of audio playback resources.
  • the audio stream transmission request may be an audio stream transmission message (SSAP_CALL_METHOD_REQ message).
  • an audio occupancy field may be added to the audio stream transmission message. Then, the attribute information of the first audio stream is carried in the audio occupancy field.
  • For ease of understanding, refer to Table 2 for an example.
  • Table 2 shows an example of the description of the audio stream transmission message.
  • the audio stream transmission message may carry an operation code 0x03 indicating that the message is an audio stream transmission message.
  • the operation code may occupy a length of 1 byte.
  • the audio stream transmission message also includes several parameters (or fields) such as the number of audio host access points, the audio host access point identifier, and the audio occupancy. Among them, the number of audio host access points field occupies a length of 1 byte and is used to describe the number of audio access points to be operated.
  • the audio host access point identifier field occupies a length of 1 byte and is used to describe the audio access point identifier to be operated.
  • the audio occupancy field occupies a length of 4 bytes and is used to describe the type of audio stream duration and/or the audio stream duration.
  • the type of audio stream duration in the audio occupancy field occupies a length of 1 byte.
  • 0x00 can be used to indicate that the type of audio stream duration (or occupancy type) is not specified
  • 0x01 can be used to indicate that the type of audio stream duration is a short audio duration type (or a short-term occupancy type)
  • 0x02 can be used to indicate that the type of audio stream duration is a long audio duration type (or a long-term occupancy type).
  • the duration of the audio stream in the audio occupancy field occupies 3 bytes.
  • 0x000000 can be used to indicate that the duration of the audio stream is not specified, and then a specific value is used to indicate the duration of the specific audio stream.
  • the embodiment of the present application does not limit the size of the duration of the audio stream.
  • the audio occupancy field can also include a reserved bit, which is not currently defined.
  • the audio occupancy field may also be used to carry the priority of the audio stream.
  • the specific representation method may refer to the representation method of the type of the duration of the audio stream, which will not be described in detail here.
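The layout of the 4-byte audio occupancy field can be sketched as a pack/unpack pair (the big-endian byte order is an assumption here; the tables above only fix the field widths and the "not specified" values):

```python
TYPE_UNSPECIFIED, TYPE_SHORT, TYPE_LONG = 0x00, 0x01, 0x02

def pack_audio_occupancy(duration_type: int = TYPE_UNSPECIFIED,
                         duration: int = 0x000000) -> bytes:
    """1 byte for the audio stream duration type, 3 bytes for the
    duration; 0x00 / 0x000000 mean 'not specified'."""
    if duration_type not in (TYPE_UNSPECIFIED, TYPE_SHORT, TYPE_LONG):
        raise ValueError("unknown duration type")
    if not 0 <= duration <= 0xFFFFFF:
        raise ValueError("duration does not fit in 3 bytes")
    return bytes([duration_type]) + duration.to_bytes(3, "big")

def unpack_audio_occupancy(field: bytes) -> tuple:
    return field[0], int.from_bytes(field[1:4], "big")
```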
  • the above-mentioned audio stream configuration request may be an audio stream configuration message (SSAP_CALL_METHOD_REQ message).
  • an audio occupancy field may be added to the audio stream configuration message. Then, the attribute information of the first audio stream is carried in the audio occupancy field.
  • For ease of understanding, refer to Table 3 for an example.
  • Table 3 shows an example of the relevant description of the audio stream configuration message.
  • the audio stream configuration message may carry the operation code 0x01 indicating that the message is an audio stream configuration message.
  • the operation code may occupy a length of 1 byte.
  • the audio stream configuration message also includes the number of audio host access points, the audio host access point identifier, and the audio occupancy parameters (or fields), and the specific description thereof may refer to the relevant description of Table 2 above, which will not be repeated here.
  • the audio stream configuration message also includes the codec identifier, codec parameters, transmission mode, transparent transmission mode, service data unit (SDU) cycle, audio stream type, and port number. The specific description of these fields may refer to Table 3 above, which will not be repeated here.
  • Both the audio stream transmission message and the audio stream configuration message can be represented by an SSAP_CALL_METHOD_REQ message.
  • the difference is that different messages can be distinguished by operation codes in the SSAP_CALL_METHOD_REQ message.
  • the attribute information may be added to the audio stream transmission request (for example, added to the audio occupancy field of the audio stream transmission message), and sent to the audio sink device.
  • the attribute information may be added to the audio stream configuration request (for example, added to the audio occupancy field of the audio stream configuration message), and sent to the audio sink device.
  • the audio sink device receives attribute information of a first audio stream from the audio source device.
  • the audio sink device receives the audio stream transmission request or the audio stream configuration request from the audio source device, and obtains the attribute information of the first audio stream from the request, for example, the attribute information of the first audio stream may be obtained from the audio occupancy field of the audio stream transmission message or the audio stream configuration message.
  • the audio sink device determines an audio focus type of the first audio stream based on the attribute information, where the audio focus type is a long focus type or a short focus type.
  • the attribute information of the first audio stream obtained by the above-mentioned audio sink device includes the above-mentioned first type.
  • the audio sink device can first determine whether the first type is reasonable based on the first duration.
  • the audio sink device can compare the first duration with the threshold value (i.e., the above-mentioned second duration) used to determine the type of the audio stream duration.
  • the audio sink device can obtain the duration of one or more audio streams stored in the audio sink device from the above-mentioned audio focus management stack or the local audio stream environment attribute stored corresponding to the above-mentioned audio stream management service. Then, in the same way as the above-mentioned sound source device obtains the second duration, the second duration is obtained by taking the average, weighted average, or the maximum value of the duration of the one or more audio streams.
  • If the obtained first duration is less than the second duration and the obtained first type is a short audio duration type, then the first type is reasonable.
  • If the first duration is greater than the second duration and the obtained first type is a long audio duration type, then the first type is reasonable. If the first duration is equal to the second duration, either type is reasonable.
  • If the first duration is less than the second duration but the obtained first type is a long audio duration type, then the first type is unreasonable.
  • If the first duration is greater than the second duration but the obtained first type is a short audio duration type, then the first type is unreasonable.
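The cases above amount to a single sink-side sanity check (a sketch; at exact equality both types are treated as reasonable, consistent with the source-side rule that either type may be chosen there):

```python
def first_type_is_reasonable(first_duration: float,
                             second_duration: float,
                             first_type: str) -> bool:
    """Check whether the duration type indicated by the audio source
    device is consistent with the indicated duration."""
    if first_duration < second_duration:
        return first_type == "short"
    if first_duration > second_duration:
        return first_type == "long"
    return True  # equal durations: either type is reasonable
```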
  • the audio host device can determine the audio focus type of the first audio stream based on the first type. Specifically, if the first type is a short audio duration type, then the audio focus type of the first audio stream can be determined to be a short focus type. If the first type is a long audio duration type, then the audio focus type of the first audio stream can be determined to be a long focus type.
  • the audio sink device may reject the audio stream transmission request or audio stream configuration request of the audio source device. And send a response message (for example, an SSAP_CALL_METHOD_RSP message) of the audio stream transmission request or the audio stream configuration request to the audio source device.
  • the response message of the audio stream transmission request or the audio stream configuration request may carry a result code, and the result code is used to indicate that the first type included in the audio stream transmission request or the audio stream configuration request is unreasonable.
  • the response message of the audio stream transmission request may also carry an operation code 0x03, the number of audio access points, and the audio access point identifier.
  • the meanings of these items of information can refer to the description related to the above Table 2, which will not be repeated here.
  • the response message of the audio stream configuration request may also carry an operation code 0x01, the number of audio access points, and the audio access point identifier, etc.
  • the meanings of these items of information can refer to the description related to the above Table 3, which will not be repeated here.
  • the audio host device may not adopt the first type specified in the audio stream transmission request or the audio stream configuration request.
  • the type of the duration of the first audio stream is re-determined based on the first duration included in the audio stream transmission request or the audio stream configuration request. Specifically, if the first duration is less than the second duration, the type of the duration of the first audio stream is determined to be a short audio duration type. If the first duration is greater than the second duration, the type of the duration of the first audio stream is determined to be a long audio duration type. If the first duration is equal to the second duration, the type of the duration of the first audio stream is determined to be a short audio duration type or a long audio duration type. Then, based on the type of the duration of the re-determined first audio stream, the audio focus type of the first audio stream is determined. For specific implementation, see the above description, which will not be repeated here.
  • the audio sink device may not re-determine the type of the duration of the first audio stream, but instead determine the audio focus type of the first audio stream directly based on the first duration included in the audio stream transmission request or the audio stream configuration request. Specifically, the audio sink device may compare the first duration with a preset threshold or with the second duration. If the first duration is less than the preset threshold (or the second duration), the audio focus type of the first audio stream is determined to be a short focus type. If the first duration is greater than the preset threshold (or the second duration), the audio focus type of the first audio stream is determined to be a long focus type.
  • If the first duration is equal to the preset threshold (or the second duration), the audio focus type of the first audio stream may be determined to be either a short focus type or a long focus type. It can be understood that the preset threshold can be set according to the actual implementation, which is not limited in the embodiments of the present application.
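The duration comparison described above can be sketched as follows. This is an illustrative Python sketch, not part of the patent; the names (`SHORT_FOCUS`, `second_duration_ms`, the short-focus default for the equal case) are assumptions for illustration only.

```python
# Illustrative sketch (not from the patent): the audio sink device picks an
# audio focus type for an incoming stream by comparing its duration against
# a reference duration (a preset threshold, or the "second duration" derived
# from the durations of streams already stored on the sink).
SHORT_FOCUS = "short_focus"
LONG_FOCUS = "long_focus"

def determine_focus_type(first_duration_ms: int, second_duration_ms: int) -> str:
    if first_duration_ms < second_duration_ms:
        return SHORT_FOCUS
    if first_duration_ms > second_duration_ms:
        return LONG_FOCUS
    # Equal case: the text allows either type; this sketch defaults to short.
    return SHORT_FOCUS

# A 3 s notification tone versus a 3 min song:
assert determine_focus_type(3_000, 30_000) == SHORT_FOCUS
assert determine_focus_type(180_000, 30_000) == LONG_FOCUS
```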
  • the audio sink device can select a reasonable audio focus type based on the duration of the audio stream indicated by the sound source device, thereby quickly applying for an available audio focus for the audio stream, playing the audio as quickly as possible, and improving the user experience.
  • the attribute information of the first audio stream obtained by the above-mentioned audio sink device includes the above-mentioned first type, but the attribute information does not include the above-mentioned first duration.
  • the value of the audio stream duration in the audio occupancy field is 0x000000, which means that the duration of the above-mentioned first audio stream is not specified.
  • the audio sink device may not make a judgment on whether the first type is reasonable. Instead, the audio focus type of the above-mentioned first audio stream is determined based on the first type. For the specific implementation, please refer to the above description, which will not be repeated here.
  • the attribute information of the first audio stream obtained by the above-mentioned audio sink device includes the above-mentioned first duration, but does not include the above-mentioned first type.
  • the audio sink device can determine the audio focus type of the first audio stream based on the first duration. Specifically, the audio sink device can compare the first duration with a preset threshold or with the above-mentioned second duration. If the first duration is less than the preset threshold (or the second duration), the audio focus type of the first audio stream is determined to be a short focus type. If the first duration is greater than the preset threshold (or the second duration), the audio focus type of the first audio stream is determined to be a long focus type.
  • If the first duration is equal to the preset threshold (or the second duration), the audio focus type of the first audio stream may be determined to be either a short focus type or a long focus type. It can be understood that the preset threshold can be set according to the actual implementation, which is not limited in the embodiments of the present application.
  • the attribute information of the first audio stream obtained by the above-mentioned audio sink device includes the priority of the above-mentioned first audio stream.
  • the audio sink device can determine the audio focus type of the first audio stream based on the priority of the first audio stream. Specifically, if the priority of the first audio stream is high, the audio focus type of the first audio stream is determined to be a long focus type. If the priority of the first audio stream is low, the audio focus type of the first audio stream is determined to be a short focus type.
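Under the same caveat (an illustrative Python sketch with hypothetical names, not part of the patent), the priority-based rule is a direct mapping:

```python
# Illustrative sketch (not from the patent): map a stream's priority to an
# audio focus type. Priority labels are assumptions for illustration.
def focus_type_from_priority(priority: str) -> str:
    """High-priority streams get a long focus; low-priority streams a short focus."""
    if priority == "high":
        return "long_focus"
    if priority == "low":
        return "short_focus"
    raise ValueError(f"unknown priority: {priority!r}")

assert focus_type_from_priority("high") == "long_focus"
assert focus_type_from_priority("low") == "short_focus"
```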
  • the audio sink device determines the audio focus type of the audio stream based on the audio stream attribute information from the audio source device and/or the local audio stream environment attributes of the audio sink device, so that the determined audio focus type matches the audio duration of the audio stream or is consistent with the operation status of the audio sink device, thereby achieving reasonable allocation of audio playback resources and improving user experience.
  • the audio source device and/or the audio sink device may determine the storage space allocated to the first audio stream and/or determine the processing priority of the first audio stream based on the attribute information of the first audio stream.
  • Before the audio source device sends the first audio stream to the audio sink device, it must first cache the first audio stream in a sending queue to wait for sending. The audio source device can then allocate a reasonable cache space for the first audio stream based on the duration of the first audio stream (i.e., the first duration mentioned above).
  • the audio source device can also determine the sending order of the first audio stream based on the first duration. Exemplarily, if the first duration is less than a threshold (referred to as threshold 3), indicating that the number of audio frames that need to be buffered for the first audio stream is small, the first audio stream can be sent first. If the first duration is greater than or equal to threshold 3, indicating that the number of audio frames that need to be buffered for the first audio stream is large, sending it first would affect the sending of other service information; in this case, the first audio stream can be sent normally according to the queuing order of the sending queue. It can be understood that threshold 3 can be set according to the actual implementation, which is not limited in the embodiments of the present application.
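The queuing behavior above can be sketched as follows; this Python sketch is illustrative only, and `THRESHOLD_3_MS` and the stream dictionaries are hypothetical.

```python
# Illustrative sketch (not from the patent): short streams jump to the front
# of the sending queue, long streams keep normal FIFO order.
from collections import deque

THRESHOLD_3_MS = 5_000  # "threshold 3"; value is implementation-defined

def enqueue(send_queue: deque, stream: dict) -> None:
    """Place a buffered stream in the send queue according to its duration."""
    if stream["duration_ms"] < THRESHOLD_3_MS:
        send_queue.appendleft(stream)  # few frames to buffer: send first
    else:
        send_queue.append(stream)      # many frames: normal queuing order

q = deque()
enqueue(q, {"name": "song", "duration_ms": 200_000})
enqueue(q, {"name": "chime", "duration_ms": 1_500})
assert [s["name"] for s in q] == ["chime", "song"]  # chime is sent first
```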
  • the buffer allocated to the audio stream in the existing implementation is generally a fixed empirical value, and the value generally has a large margin.
  • the audio sink device can learn the first duration of the above-mentioned first audio stream, so it can allocate a buffer for the first audio stream according to the actual duration to save storage resources. If the first duration is less than a threshold (referred to as threshold 4), the audio sink device can allocate, for the first audio stream, a storage space that can store an audio stream of the first duration.
  • It can be understood that threshold 4 can be set according to the actual implementation, which is not limited in the embodiments of the present application.
  • If the first duration is greater than or equal to threshold 4, the audio sink device may allocate a smaller storage space (referred to as the first storage space) for the first audio stream.
  • the duration of the audio stream that can be stored in the first storage space is less than the first duration, but the first storage space can be used to cyclically store the not-yet-used portion of the first audio stream.
  • the first audio stream is gradually sent to the audio sink device in the form of a stream; the audio sink device can store the received part of the audio stream in the first storage space while receiving the first audio stream, and at the same time obtain the audio stream from the first storage space and use it (for example, play it).
  • the part of the audio stream that has already been used can be released from the first storage space to make room for storing the newly received part of the audio stream.
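The cyclic-storage behavior can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation; frame-level granularity, the class name, and the capacity are assumptions.

```python
# Illustrative sketch (not from the patent): a small fixed buffer cyclically
# stores the not-yet-played part of a long stream; playing a frame releases
# its slot for the next received frame.
from collections import deque

class CyclicStreamBuffer:
    def __init__(self, capacity_frames: int):
        self._frames = deque()
        self._capacity = capacity_frames

    def can_accept(self) -> bool:
        return len(self._frames) < self._capacity

    def push(self, frame: bytes) -> None:
        if not self.can_accept():
            raise BufferError("buffer full; wait until a played frame is released")
        self._frames.append(frame)

    def pop_for_playback(self) -> bytes:
        # Removing the frame for playback frees space for new frames.
        return self._frames.popleft()

buf = CyclicStreamBuffer(capacity_frames=3)
for f in (b"f0", b"f1", b"f2"):
    buf.push(f)
assert not buf.can_accept()
assert buf.pop_for_playback() == b"f0"  # oldest frame is played first
buf.push(b"f3")                         # its slot now holds a new frame
```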
  • the above-mentioned audio sink device can learn the first type of the duration of the above-mentioned first audio stream, and can thus allocate a buffer for the first audio stream according to the first type.
  • If the first type is a short audio duration type, the audio sink device can allocate, for the first audio stream, a storage space that can store an audio stream of the first duration.
  • If the first type is a long audio duration type, the audio sink device can allocate a smaller storage space for the first audio stream.
  • the above-mentioned audio sink device can also determine, based on the above-mentioned first duration or the above-mentioned first type, whether to use (for example, play) the above-mentioned first audio stream with priority. For example, if the above-mentioned first duration is less than a threshold (referred to as threshold 5) or the above-mentioned first type is a short audio duration type, it can be determined that the first audio stream is used (for example, played) with priority. Conversely, if the above-mentioned first duration is greater than or equal to threshold 5 or the above-mentioned first type is a long audio duration type, the first audio stream can be used (for example, played) after the currently used audio stream has finished. It can be understood that threshold 5 can be set according to the actual implementation, which is not limited in the embodiments of the present application.
  • the present application embodiment provides another audio control method, which can be exemplarily shown in FIG. 7.
  • the method includes but is not limited to the following steps:
  • the audio sink device allocates an audio focus to the first audio stream in the form of a short focus.
  • the audio sink device may allocate the audio focus to the first audio stream in the form of a short focus after receiving the audio stream transmission request of the first audio stream from the audio source device.
  • the above-mentioned audio source device sends the audio stream transmission request to the audio sink device, and the audio sink device may receive the audio stream transmission request.
  • the audio sink device may thereby know that the audio source device is about to send an audio stream (i.e., the above-mentioned first audio stream) to the audio sink device.
  • the application to which the first audio stream belongs in the audio sink device may apply for an audio focus for the first audio stream.
  • Specifically, an audio focus of the short focus type may be applied for the first audio stream.
  • the audio sink device may allocate the audio focus to the application to which the first audio stream belongs in the form of a short focus. Then, after receiving the first audio stream, the audio sink device may use (for example, play) the first audio stream based on the audio focus of the short focus type.
  • the attribute information of the first audio stream is not indicated in the audio stream transmission request.
  • the first duration of the first audio stream and the first type of the duration of the first audio stream are not indicated in the audio stream transmission request, nor is the priority of the first audio stream indicated.
  • For the introduction of the audio stream transmission request, reference may be made to the corresponding description of S501 in FIG. 5 above, which will not be repeated here.
  • the audio sink device may actively send an audio stream transmission change event of the above-mentioned first audio stream to the audio source device to request the audio source device to send the first audio stream.
  • the audio sink device may allocate the audio focus to the first audio stream in the form of a short focus before sending the audio stream transmission change event to the audio source device.
  • the application to which the first audio stream belongs in the audio sink device may apply for an audio focus for the first audio stream.
  • Specifically, an audio focus of the short focus type may be applied for the first audio stream.
  • the audio sink device may allocate the audio focus to the application to which the first audio stream belongs in the form of a short focus. Then, after receiving the first audio stream, the audio sink device may use (for example, play) the first audio stream based on the audio focus of the short focus type.
  • In this case, the audio sink device can actively send the audio stream transmission change event of the first audio stream to the audio source device.
  • the audio sink device determines that the duration for which the first audio stream occupies the audio focus is greater than a threshold.
  • the above-mentioned audio sink device receives the first audio stream from the audio source device, and uses (for example, plays) the first audio stream based on the audio focus of the short focus type of the above-mentioned application. After the audio sink device plays the above-mentioned first audio stream based on the audio focus of the short focus type, it can detect the duration for which the first audio stream (or the application to which the first audio stream belongs) occupies the audio focus. For example, a timer can be set, and when the duration of the timer reaches a preset duration threshold, it can be determined that the duration for which the first audio stream occupies the audio focus is greater than the threshold.
  • the audio sink device changes the audio focus type of the first audio stream from short focus to long focus.
  • After the audio sink device determines that the duration for which the first audio stream occupies the audio focus is greater than the threshold, this indicates that the duration of the first audio stream is long and that the audio focus of the short focus type applied for it is unreasonable. In this case, the audio sink device can change the audio focus type of the first audio stream from short focus to long focus, so that the first audio stream can continue to be used (for example, played).
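The short-to-long promotion can be sketched as follows. This Python sketch is illustrative only; `PRESET_THRESHOLD_MS` and the tick-based timing are assumptions standing in for a real timer.

```python
# Illustrative sketch (not from the patent): a stream starts with a short
# focus; once it has held the focus longer than a preset duration threshold,
# the sink changes its focus type from short focus to long focus.
PRESET_THRESHOLD_MS = 10_000  # hypothetical preset duration threshold

class FocusHolder:
    def __init__(self):
        self.focus_type = "short_focus"  # allocated on the transmission request
        self.held_ms = 0

    def tick(self, elapsed_ms: int) -> None:
        """Called periodically while the stream occupies the audio focus."""
        self.held_ms += elapsed_ms
        if self.focus_type == "short_focus" and self.held_ms > PRESET_THRESHOLD_MS:
            self.focus_type = "long_focus"  # promote: short focus -> long focus

h = FocusHolder()
h.tick(8_000)
assert h.focus_type == "short_focus"  # still under the threshold
h.tick(5_000)
assert h.focus_type == "long_focus"   # 13 s > 10 s: promoted to long focus
```

A stream that finishes playing before the timer fires never triggers the change, which matches the behavior described below for short streams.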
  • When the above-mentioned audio sink device allocates the audio focus of the short focus type to the application to which the first audio stream belongs, it may not release the information of other applications in the audio focus management stack, but add the information of the application to which the first audio stream belongs to the audio focus management stack so that it becomes the top element of the stack. That is, the application to which the first audio stream belongs obtains the audio focus.
  • When the audio focus type is changed to long focus, the audio sink device can remove the other elements in the audio focus management stack; the application to which the first audio stream belongs is pushed into the stack to obtain the audio focus, and the other applications permanently lose the audio focus.
  • If, while the above-mentioned audio sink device uses (for example, plays) the first audio stream based on the audio focus of the above-mentioned short focus type, the duration for which the first audio stream occupies the audio focus remains less than the above-mentioned preset duration threshold until the first audio stream has been used (for example, played) to completion, the above-mentioned audio focus type change operation will not be triggered.
  • the audio focus management stack of the audio sink device includes information of application 1, application 2 and application 3.
  • the durations of the audio streams of application 1, application 2 and application 3 are duration 1, duration 2 and duration 3, respectively.
  • the current top element of the stack is application 3, that is, application 3 occupies the audio focus.
  • application 4 first applies for an audio focus of the short focus type. Then, the top element of the stack, application 3, loses the audio focus, and application 4 is pushed into the stack to occupy the audio focus, and the audio focus is a short focus.
  • the audio sink device can play the first audio stream based on the audio focus. From the time when application 4 occupies the audio focus, a timer can be set. When the duration of the timer reaches the preset duration threshold, it can be determined that the duration for which application 4 (or the first audio stream) occupies the audio focus is greater than the threshold. In this case, application 4 can apply for a long focus instead. All elements in the audio focus management stack (application 1, application 2, and application 3) are then popped out of the stack, and application 4 is pushed into the stack to occupy the audio focus. Application 1, application 2, and application 3 permanently lose the audio focus.
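The stack behavior in the example above can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation; the class and method names are assumptions.

```python
# Illustrative sketch (not from the patent) of the audio focus management
# stack: a short-focus request keeps the other applications stacked (they can
# regain focus later), while a promotion to long focus pops them permanently.
class AudioFocusStack:
    def __init__(self):
        self._stack = []  # top of stack = current audio focus holder

    def request_short_focus(self, app: str) -> None:
        self._stack.append(app)  # previous top loses focus but stays stacked

    def promote_to_long_focus(self, app: str) -> list:
        """Pop every other element; they permanently lose the audio focus."""
        evicted = [a for a in self._stack if a != app]
        self._stack = [app]
        return evicted

    def focus_owner(self) -> str:
        return self._stack[-1]

stack = AudioFocusStack()
for app in ("application 1", "application 2", "application 3"):
    stack.request_short_focus(app)
stack.request_short_focus("application 4")   # application 3 loses the focus
assert stack.focus_owner() == "application 4"
evicted = stack.promote_to_long_focus("application 4")
assert evicted == ["application 1", "application 2", "application 3"]
assert stack.focus_owner() == "application 4"
```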
  • the audio sink device dynamically applies for a corresponding type of audio focus for the audio stream based on the duration for which the audio stream occupies the audio focus, so that the type of audio focus applied for matches the audio duration of the audio stream, thereby achieving reasonable allocation of audio playback resources and improving user experience.
  • each control unit or device includes a hardware structure and/or software module corresponding to the execution of each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered as going beyond the scope of this application.
  • the embodiment of the present application can divide the functional modules of the device according to the above method example.
  • each functional module can be divided according to each function, or two or more functions can be integrated into one module.
  • the above integrated module can be implemented in the form of hardware or in the form of software functional modules. It should be noted that the division of modules in the embodiment of the present application is schematic and is only a logical function division. There may be other division methods in actual implementation.
  • the embodiment of the present application also provides a device for implementing any of the above methods, for example, a device is provided including units (or means) for implementing each step in any of the above methods.
  • FIG. 8 is a schematic diagram of the structure of an audio sink device 800 provided in an embodiment of the present application.
  • the audio sink device 800 may include a receiving unit 801 and a processing unit 802. Among them:
  • the receiving unit 801 is used to receive the attribute information of the first audio stream from the audio source device.
  • the receiving unit 801 can be used to implement the receiving operation of S502 in FIG. 5 , for example.
  • the processing unit 802 is configured to determine the audio focus type of the first audio stream based on the attribute information, where the audio focus type is a long focus type or a short focus type.
  • the processing unit 802 can be used to implement the audio focus type determination operation in S503 in FIG. 5 , for example.
  • the attribute information indicates one or more of the following: a first duration of the first audio stream, a first type of the first duration, or a priority of the first audio stream.
  • the attribute information indicates a first type of a first duration of the first audio stream
  • the audio sink device 800 further includes a sending unit, configured to send first information to the audio source device before the receiving unit 801 receives the attribute information of the first audio stream from the audio source device; the first information includes the duration of one or more audio streams stored in the audio sink device 800, and the duration of the one or more audio streams is used to determine the first type.
  • the sending unit is specifically configured to send a local audio stream attribute reading response message to the audio source device, wherein the local audio stream attribute reading response message includes the first information.
  • the duration of the one or more audio streams is stored in an audio focus management stack of the audio sink device 800 .
  • the attribute information indicates a first type of a first duration of the first audio stream; when the first type is a short audio duration type, the audio focus type of the first audio stream is a short focus type; or, when the first type is a long audio duration type, the audio focus type of the first audio stream is a long focus type.
  • the attribute information indicates a first duration of the first audio stream; when the first duration is less than a first threshold, the audio focus type of the first audio stream is a short focus type; or, when the first duration is greater than the first threshold, the audio focus type of the first audio stream is a long focus type.
  • the receiving unit 801 is specifically configured to:
  • the attribute information of the first audio stream is received from the audio source device during the audio stream configuration or reconfiguration phase; or, the attribute information of the first audio stream is received from the audio source device during the audio stream transmission phase.
  • the processing unit 802 is further configured to determine a storage space allocated to the first audio stream and/or a processing priority of the first audio stream based on the attribute information.
  • the processing unit 802 is specifically used to: when the attribute information indicates that the first duration of the first audio stream is greater than a second threshold, allocate a first storage space to the first audio stream; the duration of the audio stream that can be stored in the first storage space is less than the first duration, and the first storage space is used to cyclically store unused audio streams in the first audio stream.
  • each unit in the audio sink device 800 shown in FIG. 8 can be found in the corresponding descriptions in the above-mentioned FIG. 5 and its possible embodiments, which will not be repeated here.
  • FIG. 9 is a schematic diagram of the structure of an audio sink device 900 provided in an embodiment of the present application.
  • the audio sink device 900 may include a processing unit 901. Among them:
  • the processing unit 901 is configured to allocate the audio focus to the first audio stream in the form of a short focus based on the audio stream transmission request; determine that the duration of the first audio stream occupying the audio focus is greater than a threshold; and change the audio focus type of the first audio stream from the short focus to the long focus.
  • the processing unit 901 can be used to implement the operations in S701 to S703 in FIG. 7 , for example.
  • the audio stream transmission request does not indicate the attribute information of the first audio stream.
  • the audio sink device further includes a receiving unit for receiving an audio stream transmission request for the first audio stream from the audio source device before the processing unit assigns the audio focus to the first audio stream in the form of a short focus; the audio stream transmission request does not indicate attribute information of the first audio stream.
  • the audio sink device further includes a sending unit, configured to send an audio stream transmission change event of the first audio stream to the audio source device after the processing unit assigns the audio focus to the first audio stream in the form of a short focus.
  • each unit in the audio sink device 900 shown in FIG. 9 can be found in the corresponding descriptions in FIG. 7 and its possible embodiments, and will not be repeated here.
  • FIG. 10 is a schematic diagram of the structure of an audio source device 1000 provided in an embodiment of the present application.
  • the audio source device 1000 may include a processing unit 1001 and a sending unit 1002. Among them:
  • the processing unit 1001 is configured to obtain attribute information of a first audio stream.
  • the sending unit 1002 is used to send the attribute information of the first audio stream to the audio sink device; the attribute information is used to determine the audio focus type of the first audio stream.
  • the sending unit 1002 can be used to implement the sending operation of S501 in FIG. 5 .
  • the attribute information indicates one or more of the following: a first duration of the first audio stream, a first type of the first duration, or a priority of the first audio stream.
  • the attribute information indicates a first type of a first duration of the first audio stream
  • the audio source device 1000 further includes a receiving unit, configured to receive first information from the audio sink device before the processing unit 1001 obtains the attribute information of the first audio stream; the first information includes the duration of one or more audio streams stored in the audio sink device;
  • the processing unit 1001 is further configured to determine a second duration based on the duration of the one or more audio streams; and determine the first type based on the second duration and the first duration.
  • When the first duration is less than the second duration, the first type is a short audio duration type; or, when the first duration is greater than the second duration, the first type is a long audio duration type.
  • Before acquiring the attribute information of the first audio stream, the processing unit 1001 is further used to determine the attribute information of the first audio stream based on the associated information of the application to which the first audio stream belongs; the associated information of the application includes the audio play list, audio play mode or historical play data of the application.
  • the processing unit 1001 is further configured to, after acquiring the attribute information of the first audio stream, determine, based on the attribute information, a storage space allocated to the first audio stream and/or a processing priority of the first audio stream.
  • the implementation of each unit in the audio source device 1000 shown in FIG. 10 may refer to the corresponding descriptions in the above-mentioned FIG. 5 and its possible embodiments, and will not be repeated here.
  • the audio sink device 1100 includes: a processor 1101, a memory 1102, and a communication interface 1103.
  • the processor 1101, the communication interface 1103, and the memory 1102 may be connected to each other, or connected through a bus 1104.
  • the memory 1102 is used to store the computer programs and data of the audio sink device 1100.
  • the memory 1102 may include, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) or portable read-only memory (compact disc read-only memory, CD-ROM), etc.
  • the software or program codes required for all or part of the functions of the audio sink device in the above method embodiment may be stored in the memory 1102.
  • the processor 1101 in addition to calling the program code in the memory 1102 to implement some functions, can also cooperate with other components (such as the communication interface 1103) to jointly complete other functions described in the method embodiment (such as the function of receiving or sending data).
  • the processor 1101 may be a circuit with data processing capability.
  • For example, the processor may be a circuit with instruction reading and running capability, such as a central processing unit (CPU), a microprocessor, a graphics processing unit (GPU) (which may be understood as a kind of microprocessor), or a digital signal processor (DSP).
  • Alternatively, the processor may implement certain functions through the logical relationship of a hardware circuit, and the logical relationship of the hardware circuit may be fixed or reconfigurable; for example, the processor may be a hardware circuit implemented as an application-specific integrated circuit (ASIC) or a programmable logic device (PLD), such as a field programmable gate array (FPGA).
  • the process of the processor loading a configuration document to implement the hardware circuit configuration may be understood as the process of the processor loading instructions to implement the functions of some or all of the above units.
  • Alternatively, the processor may be a hardware circuit designed for artificial intelligence, which can be understood as a kind of ASIC, such as a neural network processing unit (NPU), a tensor processing unit (TPU), or a deep learning processing unit (DPU).
  • In addition, the processor 1101 may be a combination of at least two of the above processor forms.
  • the processor 1101 may be configured to read the program stored in the memory 1102 and execute the operations performed by the audio sink device in FIG. 5 or FIG. 7 and possible embodiments thereof.
  • each unit in the audio sink device 1100 shown in FIG. 11 may refer to the corresponding descriptions in the above-mentioned FIG. 5 or FIG. 7 and their possible embodiments, and will not be repeated here.
  • the audio source device 1200 includes: a processor 1201, a memory 1202, and a communication interface 1203.
  • the processor 1201, the communication interface 1203, and the memory 1202 may be connected to each other, or connected through a bus 1204.
  • the memory 1202 is used to store the computer programs and data of the audio source device 1200.
  • the memory 1202 may include, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) or portable read-only memory (compact disc read-only memory, CD-ROM), etc.
  • the software or program codes required for all or part of the functions of the audio source device in the above method embodiment may be stored in the memory 1202.
  • the processor 1201 in addition to calling the program code in the memory 1202 to implement some functions, can also cooperate with other components (such as the communication interface 1203) to jointly complete other functions described in the method embodiment (such as the function of receiving or sending data).
  • For the processor 1201, reference may be made to the description of the processor 1101 in FIG. 11 above, which will not be repeated here.
  • the processor 1201 may be used to read the program stored in the memory 1202 and execute the operations performed by the audio source device in FIG. 5 or FIG. 7 and possible embodiments thereof.
  • An embodiment of the present application also provides a computer-readable storage medium, which stores a computer program or computer instructions; when the computer program or computer instructions are executed by a processor, the method implemented by the audio sink device in any of the above-mentioned FIG. 5 or FIG. 7 and their possible implementations is implemented.
  • An embodiment of the present application also provides a computer-readable storage medium, which stores a computer program or computer instructions; when the computer program or computer instructions are executed by a processor, the method implemented by the audio source device in any of the above-mentioned FIG. 5 or FIG. 7 and their possible implementations is implemented.
  • the embodiment of the present application further provides a computer program product.
  • When the computer program product is read and executed by a computer, the method implemented by the audio sink device in any of the above-mentioned FIG. 5 or FIG. 7 and possible implementations thereof will be executed.
  • the embodiment of the present application further provides a computer program product.
  • When the computer program product is read and executed by a computer, the method implemented by the audio source device in any of the above-mentioned FIG. 5 or FIG. 7 and possible implementations thereof will be executed.
  • the audio sink device determines the audio focus type of the audio stream based on the audio stream attribute information from the audio source device, so that the determined audio focus type matches the audio duration of the audio stream, thereby achieving reasonable allocation of audio service resources (such as playback resources) and improving user experience.
  • first, second, etc. are used to distinguish between identical or similar items with substantially the same effects and functions. It should be understood that there is no logical or temporal dependency between “first”, “second”, and “nth”, nor is there a limitation on the quantity and execution order. It should also be understood that although the following description uses the terms first, second, etc. to describe various elements, these elements should not be limited by the terms. These terms are only used to distinguish one element from another.
  • the size of the serial number of each process does not mean the order of execution.
  • the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • references to “one embodiment”, “an embodiment”, or “a possible implementation” throughout the specification mean that specific features, structures, or characteristics related to the embodiment or implementation are included in at least one embodiment of the present application. Therefore, the references to “in one embodiment” or “in an embodiment”, or “a possible implementation” throughout the specification do not necessarily refer to the same embodiment. In addition, these specific features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Stereophonic System (AREA)

Abstract

The present application relates to an audio control method and an associated device. The method comprises the following steps: an audio sink device receives attribute information of a first audio stream from an audio source device; and the audio sink device determines an audio focus type of the first audio stream based on the attribute information, the audio focus type being a long-focus type or a short-focus type. The present application can achieve reasonable allocation of audio service resources and improve user experience.
PCT/CN2022/133026 2022-11-18 2022-11-18 Audio control method and device WO2024103418A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/133026 WO2024103418A1 (fr) 2022-11-18 2022-11-18 Audio control method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/133026 WO2024103418A1 (fr) 2022-11-18 2022-11-18 Audio control method and device

Publications (1)

Publication Number Publication Date
WO2024103418A1 true WO2024103418A1 (fr) 2024-05-23

Family

ID=91083544

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/133026 WO2024103418A1 (fr) 2022-11-18 2022-11-18 Audio control method and device

Country Status (1)

Country Link
WO (1) WO2024103418A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120063618A1 (en) * 2010-09-14 2012-03-15 Yamaha Corporation Speaker device
US9444565B1 (en) * 2015-04-30 2016-09-13 Ninjawav, Llc Wireless audio communications device, system and method
CN106598539A (zh) * 2016-12-15 2017-04-26 广州酷狗计算机科技有限公司 Method and apparatus for processing audio within an application program
CN109996099A (zh) * 2019-04-16 2019-07-09 百度在线网络技术(北京)有限公司 Audio focus control method and system for an in-vehicle system, and in-vehicle system


Similar Documents

Publication Publication Date Title
WO2022095795A1 (fr) Communication method and apparatus, computer-readable medium, and electronic device
US9654406B2 (en) Communication traffic processing architectures and methods
CN108616458B (zh) System and method for scheduling packet transmissions on client devices
EP2997698B1 (fr) System and method for mapping a service-level topology to a service-specific data-plane logical topology
JP7218447B2 (ja) Policy control method, apparatus, and system
TWI430102B (zh) Network interface card resource allocation method, storage medium, and computer
EP1826962B1 (fr) Global switch resource manager
WO2019228344A1 (fr) Resource reconfiguration method and apparatus, terminal, and storage medium
KR100872178B1 (ko) Apparatus and method for managing priority-based wireless USB transfer services
WO2014026613A1 (fr) Network bandwidth allocation method and terminal
JP3566218B2 (ja) Bluetooth network communication method and system
WO2019157849A1 (fr) Resource scheduling method and apparatus, device, and system
US11347567B2 (en) Methods and apparatus for multiplexing data flows via a single data structure
WO2024037296A1 (fr) Protocol-family-based QUIC data transmission method and device
EP2838243A1 (fr) Method and system for presenting and aggregating capabilities
WO2019019032A1 (fr) Method and apparatus for configuring a downlink data packet
JP2021518955A (ja) Processor core scheduling method, apparatus, terminal, and storage medium
EP3961999A1 (fr) Method, apparatus, and system for determining a service transmission requirement
WO2019062995A1 (fr) Network management method, device, and system
WO2023092415A1 (fr) Message processing method and apparatus
WO2021174536A1 (fr) Communication method and related apparatus
US20240256351A1 (en) Methods for generating application for radio-access-network servers with heterogeneous accelerators
US20210320875A1 (en) Switch-based adaptive transformation for edge appliances
WO2024103418A1 (fr) Audio control method and device
WO2023065853A1 (fr) Data transmission method and apparatus, device, storage medium, and program

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22965597

Country of ref document: EP

Kind code of ref document: A1