CN112887404A

CN112887404A - Audio transmission control method and device and computer readable storage medium

Info

Publication number: CN112887404A
Application number: CN202110103836.4A
Authority: CN
Inventors: 廖松茂
Original assignee: Nubia Technology Co Ltd
Current assignee: Nubia Technology Co Ltd
Priority date: 2021-01-26
Filing date: 2021-01-26
Publication date: 2021-06-01
Anticipated expiration: 2041-01-26
Also published as: CN112887404B

Abstract

The invention discloses an audio transmission control method, an audio transmission control device, and a computer-readable storage medium. The method comprises the following steps: creating an aggregate queue for recording received audio data; acquiring audio data to be played from the aggregate queue; when the audio data to be played is extracted, removing the extracted audio data from the aggregate queue; and when the audio transmission delay in the current state, as derived from the real-time capacity of the aggregate queue, exceeds a preset value, reducing the amount of audio data received into the aggregate queue so as to reduce the audio transmission delay. The scheme provides user-friendly audio transmission control: it detects possible delay during screen recording or screen projection in a timely manner, judges the degree of the delay accurately, handles the delay in a way the user does not perceive, and thereby improves the user experience.

Description

Audio transmission control method and device and computer readable storage medium

Technical Field

The present invention relates to the field of mobile communications, and in particular, to an audio transmission control method, device, and computer-readable storage medium.

Background

With the continuous development of intelligent terminal devices, users project or record the device screen more and more frequently. Existing screen projection and screen recording schemes, however, may suffer from delay in the control of audio transmission. Specifically, during screen projection or screen recording, the pace of audio playback may fall out of step with the pace at which the original audio is transmitted, and the delay may accumulate as the projection or recording continues. When the user has just started projecting or recording, the delay is unlikely to be noticed; after a long session, the accumulated delay becomes increasingly obvious. The audio delay interferes with normal screen projection or recording, and a projection or recording file with synchronized video and audio cannot be generated, which degrades the user experience.

Disclosure of Invention

In order to solve the technical defects in the prior art, the invention provides an audio transmission control method, which comprises the following steps:

creating an aggregate queue for recording the received audio data;

acquiring audio data to be played from the aggregate queue;

when the audio data to be played is extracted, removing the extracted audio data from the aggregate queue;

and when the audio transmission delay in the current state, as derived from the real-time capacity of the aggregate queue, exceeds a preset value, reducing the amount of audio data received into the aggregate queue so as to reduce the audio transmission delay.

Optionally, the creating an aggregate queue for recording the received audio data includes:

generating a first thread for receiving the audio data transmitted by the network and decoding the audio data;

creating, by the first thread, the aggregate queue through which the audio data received in the current state, decoded, and not yet played, is recorded.

Optionally, the obtaining of the audio data to be played in the aggregate queue includes:

generating a second thread for monitoring the playing state of the audio data;

and acquiring the audio data to be played from the aggregate queue through the second thread.

Optionally, removing the extracted audio data from the aggregate queue when the audio data to be played is extracted includes:

extracting the audio data to be played from the aggregate queue through the second thread;

and removing the audio data to be played from the aggregate queue through the first thread.

Optionally, reducing the amount of audio data received into the aggregate queue when the audio transmission delay in the current state, as derived from the real-time capacity of the aggregate queue, exceeds a preset value includes:

presetting a capacity value for the aggregate queue, to be used for monitoring the delay;

and acquiring the real-time capacity of the aggregate queue, and determining whether the real-time capacity is larger or smaller than the preset capacity value.

Optionally, reducing the amount of audio data received into the aggregate queue when the delay exceeds the preset value further includes:

if the real-time capacity is smaller than the preset capacity value, determining that the delay does not exceed the preset value;

and if the real-time capacity is larger than the preset capacity value, determining that the delay exceeds the preset value.

Optionally, the reducing further includes:

presetting a frame skipping condition for processing the received audio data;

and when the delay is determined to exceed the preset value, reducing the amount of audio data subsequently received into the aggregate queue according to the frame skipping condition, so as to reduce the audio transmission delay.

Optionally, the reducing further includes:

presetting an amplitude condition for selecting audio data according to amplitude;

and when the delay is determined to exceed the preset value, detecting and deleting the audio data that meets the amplitude condition among the received audio data, so as to reduce the amount of audio data subsequently received into the aggregate queue and reduce the audio transmission delay.
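As an illustration only, the frame skipping option described above can be sketched in Python. The class name AggregateQueue, the threshold of 8 frames, and the rule "drop every second frame while over the threshold" are hypothetical choices made for this sketch; the text does not specify the frame skipping condition.

```python
from collections import deque

CAPACITY_THRESHOLD = 8   # hypothetical preset capacity value, in frames
SKIP_INTERVAL = 2        # hypothetical frame skipping condition: drop every 2nd frame


class AggregateQueue:
    """Holds decoded, not-yet-played frames and skips frames when over capacity."""

    def __init__(self, threshold=CAPACITY_THRESHOLD, skip_interval=SKIP_INTERVAL):
        self.frames = deque()
        self.threshold = threshold
        self.skip_interval = skip_interval
        self._overflow_count = 0  # frames received while over the threshold

    def delay_exceeded(self):
        # A real-time capacity larger than the preset capacity value is
        # treated as "the delay exceeds the preset value".
        return len(self.frames) > self.threshold

    def receive(self, frame):
        """Store a frame unless the frame skipping condition drops it.

        Returns True if the frame was stored, False if it was skipped.
        """
        if self.delay_exceeded():
            self._overflow_count += 1
            if self._overflow_count % self.skip_interval == 0:
                return False  # skipped: reduces the amount received
        else:
            self._overflow_count = 0
        self.frames.append(frame)
        return True
```

In this sketch the queue keeps accepting every frame while its size stays at or below the threshold, and thins the incoming stream only while the size is above it.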

The invention also proposes an audio transmission control device comprising a memory, a processor and a computer program stored on said memory and executable on said processor, said computer program, when executed by said processor, implementing the steps of the audio transmission control method as defined in any one of the above.

The present invention also proposes a computer-readable storage medium having stored thereon an audio transmission control program which, when executed by a processor, implements the steps of the audio transmission control method according to any one of the above.

The audio transmission control method, device, and computer-readable storage medium of the present invention create an aggregate queue for recording received audio data; acquire audio data to be played from the aggregate queue; remove the extracted audio data from the aggregate queue when the audio data to be played is extracted; and, when the audio transmission delay in the current state, as derived from the real-time capacity of the aggregate queue, exceeds a preset value, reduce the amount of audio data received into the aggregate queue so as to reduce the audio transmission delay. The scheme provides user-friendly audio transmission control: it detects possible delay during screen recording or screen projection in a timely manner, judges the degree of the delay accurately, handles the delay in a way the user does not perceive, and thereby improves the user experience.

Drawings

The invention will be further described with reference to the accompanying drawings and examples, in which:

fig. 1 is a schematic diagram of a hardware structure of a mobile terminal according to the present invention;

fig. 2 is a communication network system architecture diagram provided by an embodiment of the present invention;

fig. 3 is a flowchart of a first embodiment of an audio transmission control method of the present invention;

fig. 4 is a flowchart of a second embodiment of an audio transmission control method of the present invention;

fig. 5 is a flowchart of a third embodiment of an audio transmission control method of the present invention;

fig. 6 is a flowchart of a fourth embodiment of an audio transmission control method of the present invention;

fig. 7 is a flowchart of a fifth embodiment of an audio transmission control method of the present invention;

fig. 8 is a flowchart of a sixth embodiment of an audio transmission control method of the present invention;

fig. 9 is a flowchart of a seventh embodiment of an audio transmission control method of the present invention;

fig. 10 is a flowchart of an eighth embodiment of an audio transmission control method of the present invention.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only to facilitate the explanation of the present invention, and have no specific meaning in themselves. Thus, "module", "component", and "unit" may be used interchangeably.

The terminal may be implemented in various forms. For example, the terminal described in the present invention may include a mobile terminal such as a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a Personal Digital Assistant (PDA), a Portable Media Player (PMP), a navigation device, a wearable device, a smart band, a pedometer, and the like, and a fixed terminal such as a Digital TV, a desktop computer, and the like.

The following description will be given by way of example of a mobile terminal, and it will be understood by those skilled in the art that the construction according to the embodiment of the present invention can be applied to a fixed type terminal, in addition to elements particularly used for mobile purposes.

Referring to fig. 1, which is a schematic diagram of a hardware structure of a mobile terminal for implementing various embodiments of the present invention, the mobile terminal 100 may include: RF (Radio Frequency) unit 101, WiFi module 102, audio output unit 103, A/V (audio/video) input unit 104, sensor 105, display unit 106, user input unit 107, interface unit 108, memory 109, processor 110, and power supply 111. Those skilled in the art will appreciate that the mobile terminal architecture shown in fig. 1 is not intended to be limiting of mobile terminals, which may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.

The following describes each component of the mobile terminal in detail with reference to fig. 1:

The radio frequency unit 101 may be configured to receive and transmit signals during information transmission and reception or during a call. Specifically, it receives downlink information from a base station and passes it to the processor 110 for processing, and it transmits uplink data to the base station. Typically, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 can communicate with a network and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile communications), GPRS (General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division Duplex Long Term Evolution), and TDD-LTE (Time Division Duplex Long Term Evolution).

WiFi is a short-range wireless transmission technology. Through the WiFi module 102, the mobile terminal can help a user send and receive e-mail, browse web pages, access streaming media, and the like, providing the user with wireless broadband internet access. Although fig. 1 shows the WiFi module 102, it is not an essential component of the mobile terminal and may be omitted as needed without changing the essence of the invention.

The audio output unit 103 may convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into an audio signal and output as sound when the mobile terminal 100 is in a call signal reception mode, a call mode, a recording mode, a voice recognition mode, a broadcast reception mode, or the like. Also, the audio output unit 103 may also provide audio output related to a specific function performed by the mobile terminal 100 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 103 may include a speaker, a buzzer, and the like.

The A/V input unit 104 is used to receive audio or video signals. The A/V input unit 104 may include a Graphics Processing Unit (GPU) 1041 and a microphone 1042. The graphics processor 1041 processes image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 106. The image frames processed by the graphics processor 1041 may be stored in the memory 109 (or other storage medium) or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 may receive sound in a phone call mode, a recording mode, a voice recognition mode, or the like, and can process such sound into audio data. In the phone call mode, the processed audio (voice) data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 101. The microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated in the course of receiving and transmitting audio signals.

The mobile terminal 100 also includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display panel 1061 according to the brightness of ambient light, and a proximity sensor that can turn off the display panel 1061 and/or a backlight when the mobile terminal 100 is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.

The display unit 106 is used to display information input by a user or information provided to the user. The display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.

The user input unit 107 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the user input unit 107 may include a touch panel 1071 and other input devices 1072. The touch panel 1071, also referred to as a touch screen, may collect a touch operation performed by a user on or near the touch panel 1071 (e.g., an operation performed by the user on or near the touch panel 1071 using a finger, a stylus, or any other suitable object or accessory), and drive a corresponding connection device according to a predetermined program. The touch panel 1071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch position of a user, detects the signal produced by the touch operation, and transmits the signal to the touch controller. The touch controller receives touch information from the touch detection device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 110, and can receive and execute commands sent by the processor 110. In addition, the touch panel 1071 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 1071, the user input unit 107 may include other input devices 1072. In particular, the other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.

Further, the touch panel 1071 may cover the display panel 1061, and when the touch panel 1071 detects a touch operation thereon or nearby, the touch panel 1071 transmits the touch operation to the processor 110 to determine the type of the touch event, and then the processor 110 provides a corresponding visual output on the display panel 1061 according to the type of the touch event. Although the touch panel 1071 and the display panel 1061 are shown in fig. 1 as two separate components to implement the input and output functions of the mobile terminal, in some embodiments, the touch panel 1071 and the display panel 1061 may be integrated to implement the input and output functions of the mobile terminal, and is not limited herein.

The interface unit 108 serves as an interface through which at least one external device is connected to the mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 108 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the mobile terminal 100 or may be used to transmit data between the mobile terminal 100 and external devices.

The memory 109 may be used to store software programs as well as various data. The memory 109 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. Further, the memory 109 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.

The processor 110 is a control center of the mobile terminal, connects various parts of the entire mobile terminal using various interfaces and lines, and performs various functions of the mobile terminal and processes data by operating or executing software programs and/or modules stored in the memory 109 and calling data stored in the memory 109, thereby performing overall monitoring of the mobile terminal. Processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 110.

The mobile terminal 100 may further include a power supply 111 (e.g., a battery) for supplying power to various components, and preferably, the power supply 111 may be logically connected to the processor 110 via a power management system, so as to manage charging, discharging, and power consumption management functions via the power management system.

Although not shown in fig. 1, the mobile terminal 100 may further include a bluetooth module or the like, which is not described in detail herein.

In order to facilitate understanding of the embodiments of the present invention, a communication network system on which the mobile terminal of the present invention is based is described below.

Referring to fig. 2, fig. 2 is an architecture diagram of a communication Network system according to an embodiment of the present invention, where the communication Network system is an LTE system of a universal mobile telecommunications technology, and the LTE system includes a UE (User Equipment) 201, an E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) 202, an EPC (Evolved Packet Core) 203, and an IP service 204 of an operator, which are in communication connection in sequence.

Specifically, the UE201 may be the terminal 100 described above, and is not described herein again.

The E-UTRAN202 includes eNodeB2021 and other eNodeBs 2022, among others. Among them, the eNodeB2021 may be connected with other eNodeB2022 through backhaul (e.g., X2 interface), the eNodeB2021 is connected to the EPC203, and the eNodeB2021 may provide the UE201 access to the EPC 203.

The EPC203 may include an MME (Mobility Management Entity) 2031, an HSS (Home Subscriber Server) 2032, other MMEs 2033, an SGW (Serving Gateway) 2034, a PGW (PDN Gateway) 2035, a PCRF (Policy and Charging Rules Function) 2036, and the like. The MME2031 is a control node that handles signaling between the UE201 and the EPC203, and provides bearer and connection management. The HSS2032 provides registers for managing functions such as a home location register (not shown) and holds subscriber-specific information about service characteristics, data rates, and so on. All user data may be sent through the SGW2034. The PGW2035 may provide IP address assignment and other functions for the UE201. The PCRF2036 is the policy and charging control decision point for service data flows and IP bearer resources; it selects and provides available policy and charging control decisions for a policy and charging enforcement function (not shown).

The IP services 204 may include the internet, intranets, IMS (IP Multimedia Subsystem), or other IP services, among others.

Although the LTE system is described as an example, it should be understood by those skilled in the art that the present invention is not limited to the LTE system, but may also be applied to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA, and future new network systems.

Based on the above mobile terminal hardware structure and communication network system, the present invention provides various embodiments of the method.

Example one

Fig. 3 is a flowchart of a first embodiment of an audio transmission control method according to the present invention. An audio transmission control method, the method comprising:

S1, creating an aggregate queue for recording the received audio data;

S2, acquiring audio data to be played from the aggregate queue;

S3, when the audio data to be played is extracted, removing the extracted audio data from the aggregate queue;

S4, when the audio transmission delay in the current state, as derived from the real-time capacity of the aggregate queue, exceeds a preset value, reducing the amount of audio data received into the aggregate queue so as to reduce the audio transmission delay.

Optionally, this embodiment addresses the delay that existing screen projection or screen recording schemes may exhibit in the control of audio transmission. Specifically, during screen projection or screen recording, the pace of audio playback may fall out of step with the pace at which the original audio is transmitted, and the delay may accumulate as the projection or recording continues. Because the accumulated delay becomes increasingly obvious over a long session, this embodiment detects and resolves the delay in time.

Optionally, in this embodiment, the transmission-and-decoding part and the consumption part of the audio are split into separate threads, following the producer-consumer model. The state of the audio producer's queue and the state of the audio consumer are monitored in real time. When it is determined that a noticeable delay may occur, low-frequency frame detection and identification is performed on the PCM (Pulse Code Modulation) data, and audio frames that the user cannot perceive are dynamically discarded, thereby avoiding a perceptible delay.
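The text does not define how an imperceptible frame is identified. As one possible reading, and purely as an assumption, the Python sketch below uses the amplitude condition mentioned in the disclosure: a frame is treated as imperceptible when its peak sample value stays below a limit. The frame format (signed 16-bit little-endian PCM), the limit of 200, and the function names are all hypothetical.

```python
import struct

AMPLITUDE_LIMIT = 200  # hypothetical amplitude condition, in 16-bit sample units


def peak_amplitude(pcm_frame: bytes) -> int:
    """Return the largest absolute sample in a signed 16-bit little-endian frame."""
    count = len(pcm_frame) // 2
    if count == 0:
        return 0
    samples = struct.unpack("<%dh" % count, pcm_frame[: count * 2])
    return max(abs(s) for s in samples)


def drop_quiet_frames(frames, limit=AMPLITUDE_LIMIT):
    """Keep only the frames whose peak amplitude reaches the limit."""
    return [f for f in frames if peak_amplitude(f) >= limit]
```

A peak-amplitude test is only a stand-in here; a real implementation might use frame energy or a perceptual measure instead.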

Optionally, in this embodiment, the degree of audio delay is difficult to determine, and a judgment that relies on subjective human perception is prone to inaccuracy, particularly when the delay is not yet obvious. This embodiment therefore needs to detect the audio delay in time, that is, to determine in a quantified manner the transmission delay that is about to occur.

Optionally, in this embodiment, the producer-consumer model is used to split the entire execution flow. After network data is received, the audio is decoded into PCM data, and the playable PCM data is placed into an aggregate queue, List. This part, which produces playable PCM data, is called the producer part.

Optionally, in this embodiment, the part that plays the PCM data is called the consumer part. It also runs as a single thread, which is responsible only for obtaining the corresponding PCM data from the List and then playing it through standard Qt audio playback.
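The producer and consumer parts described above can be sketched with two Python threads sharing one list. This is a minimal illustration under stated assumptions: the decode function is a hypothetical placeholder for real audio decoding, the played list stands in for the audio output, and the lock is an implementation detail the text does not mention.

```python
import threading
import time

pcm_list = []                 # the aggregate queue ("List" in the text)
list_lock = threading.Lock()  # guards pcm_list across the two threads
played = []                   # stands in for the audio output device


def decode(packet):
    # Hypothetical decoder: a real one would turn network data into PCM.
    return ("pcm", packet)


def producer(packets):
    """First thread: receive, decode, and append playable PCM data."""
    for packet in packets:
        frame = decode(packet)
        with list_lock:
            pcm_list.append(frame)


def consumer(expected, stop_after=5.0):
    """Second thread: take PCM data from the list and 'play' it."""
    deadline = time.monotonic() + stop_after
    while len(played) < expected and time.monotonic() < deadline:
        with list_lock:
            frame = pcm_list.pop(0) if pcm_list else None
        if frame is None:
            time.sleep(0.001)  # nothing to play yet
        else:
            played.append(frame)


packets = list(range(20))
t1 = threading.Thread(target=producer, args=(packets,))
t2 = threading.Thread(target=consumer, args=(len(packets),))
t1.start()
t2.start()
t1.join()
t2.join()
```

Because the producer only appends and the consumer only removes from the front, the playback order matches the arrival order, and the length of pcm_list at any moment reflects how far playback lags behind reception.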

Optionally, in this embodiment, because the List is a dynamic aggregate queue, checking the amount of data it currently holds makes it possible to determine whether a delay in the current audio is imminent or unlikely.

Optionally, in this embodiment, the current audio delay can be estimated from the size of the List through the above steps. To quantify this determination, a threshold T may be set for the size of the List. If the size of the List exceeds T, it is determined that a noticeable delay in the current audio may be imminent. In that case, the amount of audio data subsequently received into the aggregate queue is reduced so as to reduce the delay that may exist in the audio transmission.
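To make the threshold rule concrete, the following Python sketch relates the size of the List to an estimated delay. It assumes, hypothetically, that every queued frame carries a fixed 20 ms of audio and that the preset delay value is 200 ms; neither figure comes from the text, and the function names are illustrative.

```python
FRAME_MS = 20        # hypothetical duration of one queued PCM frame
MAX_DELAY_MS = 200   # hypothetical preset delay value


def capacity_threshold(max_delay_ms=MAX_DELAY_MS, frame_ms=FRAME_MS):
    """Translate a delay bound into a queue-size threshold T."""
    return max_delay_ms // frame_ms


def buffered_delay_ms(queue_size, frame_ms=FRAME_MS):
    """Estimate how much audio is waiting in the queue."""
    return queue_size * frame_ms


def delay_imminent(queue_size, threshold):
    """Apply the rule from the text: size of List > threshold T."""
    return queue_size > threshold
```

Under these assumptions T is 10 frames, so a List holding 11 frames (about 220 ms of buffered audio) triggers the reduction, while a List holding exactly 10 frames does not.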

The advantageous effect of this embodiment is as follows. An aggregate queue is created for recording received audio data; audio data to be played is acquired from the aggregate queue; when the audio data to be played is extracted, the extracted audio data is removed from the aggregate queue; and when the audio transmission delay in the current state, as derived from the real-time capacity of the aggregate queue, exceeds a preset value, the amount of audio data received into the aggregate queue is reduced so as to reduce the audio transmission delay. The scheme provides user-friendly audio transmission control: it detects possible delay during screen recording or screen projection in a timely manner, judges the degree of the delay accurately, handles the delay in a way the user does not perceive, and thereby improves the user experience.

Example two

Fig. 4 is a flowchart of a second embodiment of the audio transmission control method of the present invention. Based on the above embodiment, the creating of the aggregate queue for recording the received audio data includes:

S11, generating a first thread for receiving the audio data transmitted by the network and decoding the audio data;

S12, creating the aggregate queue by the first thread, and recording, through the aggregate queue, the audio data that has been received and decoded in the current state but not yet played.

Optionally, in this embodiment, a first thread is generated for receiving the audio data transmitted over the network and decoding the audio data; the audio data may also originate from system-layer audio data received during screen recording or screen projection.

Optionally, in this embodiment, the aggregate queue is created by the first thread and records the audio data that has been received and decoded in the current state but not yet played. The aggregate queue records the audio data as data segments or data frames, so that the capacity of the queue can later be determined from the number of data segments or data frames.

Optionally, in this embodiment, when the system runs multiple screen recording or screen projection processes, a corresponding aggregate queue is created for each process, to facilitate subsequent delay determination and delay reduction.
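One way to keep a separate aggregate queue per screen recording or screen projection process is a mapping from a session identifier to its queue, as in the Python sketch below. The identifiers "projection-1" and "recording-1", the function names, and the threshold of 3 frames are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque

# One aggregate queue per screen recording or screen projection session.
queues = defaultdict(deque)


def record_frame(session_id, frame):
    """Append a decoded frame to the queue that belongs to the session."""
    queues[session_id].append(frame)


def sessions_over_threshold(threshold):
    """Return the sessions whose queue size exceeds the threshold."""
    return sorted(sid for sid, q in queues.items() if len(q) > threshold)


# Hypothetical usage: one projection session lags, one recording does not.
for i in range(5):
    record_frame("projection-1", i)
for i in range(2):
    record_frame("recording-1", i)
lagging = sessions_over_threshold(3)
```

Counting frames per session lets the delay check and any frame reduction be applied only to the session that is actually falling behind.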

The advantageous effect of this embodiment is as follows. A first thread is generated for receiving the audio data transmitted over the network and decoding the audio data; the aggregate queue is then created by the first thread and records the audio data that has been received and decoded in the current state but not yet played. This provides the basis on which the aggregate queue is established for the user-friendly audio transmission control scheme, so that possible delay during screen recording or screen projection is detected in time, the degree of the delay is judged accurately, the delay is handled in a way the user does not perceive, and the user experience is improved.

Example three

Fig. 5 is a flowchart of a third embodiment of the audio transmission control method according to the present invention, where based on the above embodiments, the acquiring audio data to be played in the aggregate queue includes:

S21, generating a second thread for monitoring the playing state of the audio data;

S22, acquiring the audio data to be played from the aggregate queue through the second thread.

Optionally, in this embodiment, the audio data may be played at the same time as the screen is recorded or projected. To identify whether the played audio data is delayed, this embodiment generates a second thread for monitoring the playing state of the audio data.

Optionally, in this embodiment, while the audio data is being played synchronously, the audio data to be played is acquired from the aggregate queue in real time through the second thread.

The embodiment has the advantages that the second thread for monitoring the playing state of the audio data is generated; and then, acquiring the audio data to be played in the set queue through the second thread. The reading scheme of the set queue is provided for realizing a humanized audio transmission control scheme, so that possible time delay can be found in time in the screen recording or screen projection process, the degree of the time delay can be accurately judged, an imperceptible time delay judging and processing scheme is provided, and the user experience is enhanced.
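A minimal sketch of steps S21 and S22, assuming Python's standard queue.Queue stands in for the aggregate queue, might look as follows. The function name playback_loop, the stop marker and the use of a list's append method as the "player" are hypothetical choices made for this illustration.

```python
import queue
import threading


def playback_loop(aggregate_queue: queue.Queue, play, stop_marker=None) -> None:
    # Body of the second thread: fetch the next frame to be played as soon
    # as it is available and hand it to the playback callback.
    while True:
        frame = aggregate_queue.get()  # blocks until a frame is recorded
        if frame is stop_marker:
            break
        play(frame)


aggregate_queue = queue.Queue()
played = []  # stands in for an audio output device

second_thread = threading.Thread(
    target=playback_loop, args=(aggregate_queue, played.append)
)
second_thread.start()

# The receiving side records three frames, then signals the end of the stream.
for frame in (b"f1", b"f2", b"f3"):
    aggregate_queue.put(frame)
aggregate_queue.put(None)
second_thread.join()
```

Note that queue.Queue.get both obtains and removes an item in one call, whereas Example four assigns extraction and removal to different threads; the sketch merges the two only for brevity.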

Example four

Fig. 6 is a flowchart of a fourth embodiment of the audio transmission control method according to the present invention. Based on the above embodiments, removing the extracted audio data from the aggregate queue when the audio data to be played is extracted includes:

S31, extracting the audio data to be played from the aggregate queue through the second thread;

S32, removing the audio data to be played from the aggregate queue through the first thread.

Optionally, in this embodiment, the received audio data is buffered in the aggregate queue; the audio data to be played is then extracted from the aggregate queue through the second thread and removed from the aggregate queue through the first thread;

optionally, in this embodiment, the aggregate queue may also take another recording form: the audio data is marked segment by segment, the marks are recorded by the first thread, and the audio data itself is buffered by the system. The marks of the audio data to be played are then extracted from the aggregate queue through the second thread and removed from the aggregate queue through the first thread; that is, the capacity of the queue is determined by the number of marks.

The advantage of this embodiment is that the audio data to be played is extracted from the aggregate queue through the second thread and then removed from the aggregate queue through the first thread. This provides the dynamic monitoring scheme for the aggregate queue in a user-friendly audio transmission control scheme, so that delay arising during screen recording or screen projection can be found in time and its degree judged accurately, a delay determination and handling scheme that is imperceptible to the user is provided, and the user experience is enhanced.
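The mark-based recording form can be sketched as follows, under stated assumptions: the class name MarkQueue, the use of integer sequence numbers as marks and the dictionary standing in for the system-side buffer are all hypothetical. Extraction (playback side) and removal (receiving side) are kept as separate operations, mirroring S31 and S32.

```python
import threading
from collections import deque


class MarkQueue:
    """Aggregate queue that records marks; the audio itself is buffered elsewhere."""

    def __init__(self):
        self._marks = deque()    # marks of buffered, not-yet-removed segments
        self._extracted = set()  # marks already handed to playback
        self._buffer = {}        # stands in for the system-side audio buffer
        self._lock = threading.Lock()

    def record(self, mark, segment) -> None:
        with self._lock:
            self._buffer[mark] = segment
            self._marks.append(mark)

    def extract(self):
        # S31 (second thread): return the oldest segment not yet extracted.
        with self._lock:
            for mark in self._marks:
                if mark not in self._extracted:
                    self._extracted.add(mark)
                    return mark, self._buffer[mark]
            return None

    def remove_extracted(self) -> int:
        # S32 (first thread): drop the marks whose segments were extracted.
        with self._lock:
            removed = 0
            while self._marks and self._marks[0] in self._extracted:
                mark = self._marks.popleft()
                self._extracted.discard(mark)
                del self._buffer[mark]
                removed += 1
            return removed

    def capacity(self) -> int:
        # Capacity is the number of marks still recorded.
        with self._lock:
            return len(self._marks)


mark_queue = MarkQueue()
for number, segment in enumerate([b"s0", b"s1", b"s2"]):
    mark_queue.record(number, segment)

first_extracted = mark_queue.extract()
removed_count = mark_queue.remove_extracted()
```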

Example five

Fig. 7 is a flowchart of a fifth embodiment of the audio transmission control method according to the present invention. Based on the above embodiments, reducing the receiving amount of the audio data in the aggregate queue to reduce the delay of audio transmission, when the delay of audio transmission in the current state obtained according to the real-time capacity of the aggregate queue exceeds a preset value, includes:

S41, presetting a preset capacity value, corresponding to the aggregate queue, for monitoring the delay;

S42, acquiring the real-time capacity of the aggregate queue, and determining the magnitude relationship between the real-time capacity and the preset capacity value.

Optionally, in this embodiment, as described in the above example, when the aggregate queue buffers the audio data itself, this embodiment presets a preset capacity value, corresponding to the aggregate queue, for monitoring the delay, where the preset capacity value is a data capacity value;

optionally, in this embodiment, as described in the above example, when the aggregate queue records marks of the buffered audio data, this embodiment presets a preset capacity value, corresponding to the aggregate queue, for monitoring the delay, where the preset capacity value is a count of the marks;

optionally, in this embodiment, in order to improve detection accuracy, when the user is in a scene that is sensitive to delay and places high requirements on it, such as synchronous playing of video and audio, a lower preset capacity value is determined, so that the subsequent delay reduction operation is triggered in time.

The advantage of this embodiment is that a preset capacity value, corresponding to the aggregate queue, for monitoring the delay is preset; the real-time capacity of the aggregate queue is then acquired, and the magnitude relationship between the real-time capacity and the preset capacity value is determined. This provides the capacity determination scheme for the aggregate queue in a user-friendly audio transmission control scheme, so that delay arising during screen recording or screen projection can be found in time and its degree judged accurately, a delay determination and handling scheme that is imperceptible to the user is provided, and the user experience is enhanced.
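Steps S41 and S42 might be sketched as follows. The scene names, the threshold values and the 20 ms frame duration are assumptions chosen purely for illustration; the embodiment only states that a delay-sensitive scene receives a lower preset capacity value.

```python
# Hypothetical preset capacity values, counted in frames (or marks).
PRESET_CAPACITY = {"av_sync": 5, "default": 20}


def preset_capacity_for(scene: str) -> int:
    # S41: a delay-sensitive scene gets a lower preset capacity value.
    return PRESET_CAPACITY.get(scene, PRESET_CAPACITY["default"])


def compare_capacity(real_time_capacity: int, preset: int) -> int:
    # S42: return 1 if the real-time capacity is above the preset value,
    # -1 if it is below, and 0 if the two are equal.
    return (real_time_capacity > preset) - (real_time_capacity < preset)


def estimated_delay_ms(real_time_capacity: int, frame_ms: int = 20) -> int:
    # Assumption: every queued frame holds frame_ms of audio, so the backlog
    # in the queue approximates the playback delay.
    return real_time_capacity * frame_ms


print(compare_capacity(8, preset_capacity_for("av_sync")))
print(compare_capacity(8, preset_capacity_for("music")))
```

With eight frames queued, the same backlog is above the threshold in the delay-sensitive scene and below it in the default one, which is the effect the lower preset value is meant to achieve.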

Example six

Fig. 8 is a flowchart of a sixth embodiment of the audio transmission control method according to the present invention. Based on the above embodiments, reducing the receiving amount of the audio data in the aggregate queue to reduce the delay of audio transmission, when the delay of audio transmission in the current state obtained according to the real-time capacity of the aggregate queue exceeds a preset value, further includes:

S43, if the real-time capacity is smaller than the preset capacity value, determining that the delay does not exceed the preset value;

S44, if the real-time capacity is larger than the preset capacity value, determining that the delay exceeds the preset value.

Optionally, in this embodiment, a plurality of different preset capacity values are set, and when the real-time capacity is greater than the lowest preset capacity value, the receiving amount of the subsequent audio data is obtained;

optionally, in this embodiment, when the obtained receiving amount of the subsequent audio data is lower than a preset value, it is determined whether the real-time capacity is greater than the second-lowest preset capacity value;

optionally, in this embodiment, if the real-time capacity is greater than the second-lowest preset capacity value, it is determined that the delay has exceeded the preset value.

The advantage of this embodiment is that, if the real-time capacity is smaller than the preset capacity value, it is determined that the delay does not exceed the preset value, and if the real-time capacity is larger than the preset capacity value, it is determined that the delay exceeds the preset value. This provides the delay determination scheme based on the aggregate queue in a user-friendly audio transmission control scheme, so that delay arising during screen recording or screen projection can be found in time and its degree judged accurately, a delay determination and handling scheme that is imperceptible to the user is provided, and the user experience is enhanced.
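The tiered determination with several preset capacity values can be sketched as below. The function and parameter names are hypothetical, and two points that the embodiment leaves open are filled by assumption and marked in the comments: what happens when the subsequent receiving amount is not below its preset value, and how a capacity exactly equal to a threshold is treated.

```python
def delay_exceeded(real_time_capacity, preset_capacities,
                   incoming_amount, incoming_limit) -> bool:
    """Tiered check against the lowest and second-lowest preset capacity values."""
    levels = sorted(preset_capacities)
    if len(levels) < 2:
        raise ValueError("at least two preset capacity values are needed")
    lowest, second_lowest = levels[0], levels[1]

    # At or below the lowest threshold: the delay is not considered exceeded.
    # (Treating equality as "not exceeded" is an assumption.)
    if real_time_capacity <= lowest:
        return False

    # Above the lowest threshold: look at the subsequent receiving amount.
    # Assumption: if it is not below its preset value, no determination is
    # made in this pass and the function reports False.
    if incoming_amount >= incoming_limit:
        return False

    # Receiving amount is low, yet the backlog persists: compare against
    # the second-lowest threshold.
    return real_time_capacity > second_lowest


print(delay_exceeded(12, [5, 10, 20], incoming_amount=2, incoming_limit=4))
print(delay_exceeded(8, [5, 10, 20], incoming_amount=2, incoming_limit=4))
```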

Example seven

Fig. 9 is a flowchart of a seventh embodiment of the audio transmission control method according to the present invention. Based on the above embodiments, reducing the receiving amount of the audio data in the aggregate queue to reduce the delay of audio transmission, when the delay of audio transmission in the current state obtained according to the real-time capacity of the aggregate queue exceeds a preset value, further includes:

S45, presetting a frame skipping condition for processing the received audio data;

S46, when it is determined that the delay exceeds the preset value, reducing the subsequent receiving amount of the audio data in the aggregate queue according to the frame skipping condition, so as to reduce the delay of the audio transmission.

Optionally, in this embodiment, an interval frame skipping scheme is adopted for playing the current audio data. Specifically, once every several frames, one audio frame is discarded, that is, it is not sent to the Qt sound playing thread, and playback continues with the next frame, so that the possibility of an obvious break in the sound caused by dropping a large run of frames is reduced as much as possible.

Optionally, in this embodiment, the interval between discarded audio frames is determined according to the range in which the capacity difference obtained in the delay determination falls;

optionally, in this embodiment, if the capacity difference obtained in the delay determination falls in a higher range, the interval between discarded audio frames is correspondingly reduced;

optionally, in this embodiment, if the capacity difference obtained in the delay determination falls in a lower range, the interval between discarded audio frames is correspondingly enlarged.

The advantage of this embodiment is that a frame skipping condition for processing the received audio data is preset; then, when it is determined that the delay exceeds the preset value, the subsequent receiving amount of the audio data in the aggregate queue is reduced according to the frame skipping condition, so as to reduce the delay of the audio transmission. This provides the frame dropping scheme applied after delay determination in a user-friendly audio transmission control scheme, so that delay arising during screen recording or screen projection can be found in time and its degree judged accurately, a delay determination and handling scheme that is imperceptible to the user is provided, and the user experience is enhanced.
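The interval frame skipping scheme might be sketched as follows. The difference ranges and interval values are hypothetical, and the capacity difference is assumed here to be the real-time capacity minus the preset capacity value; the embodiment only states that a larger difference calls for a smaller interval.

```python
def skip_interval(capacity_difference: int) -> int:
    # Hypothetical ranges: the larger the backlog above the preset capacity
    # value, the more often a frame is discarded.
    if capacity_difference >= 20:
        return 3   # discard every 3rd frame
    if capacity_difference >= 10:
        return 5   # discard every 5th frame
    return 10      # discard every 10th frame


def frames_to_play(frames, interval: int):
    # Discard every interval-th frame and keep the rest, so that no two
    # adjacent frames are ever dropped together.
    if interval < 2:
        raise ValueError("interval must be at least 2")
    return [frame for position, frame in enumerate(frames, start=1)
            if position % interval != 0]


received = list(range(1, 11))            # ten frames numbered 1 to 10
print(frames_to_play(received, skip_interval(12)))
# prints [1, 2, 3, 4, 6, 7, 8, 9]
```

Spacing the discarded frames evenly keeps each gap to a single frame, which is what limits the audible break compared with dropping a contiguous run.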

Example eight

Fig. 10 is a flowchart of an eighth embodiment of the audio transmission control method according to the present invention. Based on the above embodiments, reducing the receiving amount of the audio data in the aggregate queue to reduce the delay of audio transmission, when the delay of audio transmission in the current state obtained according to the real-time capacity of the aggregate queue exceeds a preset value, further includes:

S47, presetting an amplitude condition for extracting the audio data according to its amplitude;

S48, when it is determined that the delay exceeds the preset value, detecting and deleting the audio data that meets the amplitude condition from the received audio data, so as to reduce the subsequent receiving amount of the audio data in the aggregate queue and reduce the delay of the audio transmission.

Optionally, in this embodiment, the sound amplitude of the decoded PCM data is calculated in the decoding thread; if the amplitude is found to be 0, the audio is evidently silent and therefore does not need to be placed in the List for the consumer to consume;

optionally, in this embodiment, a corresponding amplitude condition is determined according to the content of the current audio data. For example, when audio-sensitive content such as in-game voice is being played, a lower amplitude threshold is used as the discard condition, so as to reduce the amount discarded and avoid discarding audio by mistake.

The advantage of this embodiment is that an amplitude condition for extracting the audio data according to its amplitude is preset; then, when it is determined that the delay exceeds the preset value, the audio data that meets the amplitude condition is detected in the received audio data and deleted, so as to reduce the subsequent receiving amount of the audio data in the aggregate queue and reduce the delay of the audio transmission. This provides the discarding scheme based on amplitude screening after delay determination in a user-friendly audio transmission control scheme, so that delay arising during screen recording or screen projection can be found in time and its degree judged accurately, a delay determination and handling scheme that is imperceptible to the user is provided, and the user experience is enhanced.
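The amplitude screening can be illustrated with the following sketch. It assumes the decoded PCM data consists of signed 16-bit samples in the platform's native byte order, and it uses the peak absolute sample value as the "amplitude"; the embodiment does not specify the sample format or the amplitude measure, and the function names are hypothetical.

```python
from array import array


def peak_amplitude(pcm: bytes) -> int:
    # Interpret the bytes as signed 16-bit samples (assumption) and return
    # the largest absolute sample value; an empty frame counts as 0.
    samples = array("h")
    samples.frombytes(pcm)
    return max((abs(sample) for sample in samples), default=0)


def should_drop(pcm: bytes, amplitude_threshold: int = 0,
                delay_exceeded: bool = True) -> bool:
    # A frame is dropped only while the delay is over the preset value and
    # its amplitude meets the condition (at or below the threshold).
    return delay_exceeded and peak_amplitude(pcm) <= amplitude_threshold


silent_frame = array("h", [0, 0, 0, 0]).tobytes()
voiced_frame = array("h", [0, 1200, -300, 0]).tobytes()

print(should_drop(silent_frame))   # silent frame, delay exceeded
print(should_drop(voiced_frame))   # audible frame is kept
```

A threshold of 0 drops only fully silent frames; raising it discards quiet passages as well, which is why the embodiment suggests a lower threshold for audio-sensitive content.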

Example nine

Based on the above embodiments, the present invention also provides an audio transmission control apparatus, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the audio transmission control method according to any one of the above embodiments.

It should be noted that the device embodiment and the method embodiment belong to the same concept, and specific implementation processes thereof are detailed in the method embodiment, and technical features in the method embodiment are correspondingly applicable in the device embodiment, which is not described herein again.

Example ten

Based on the above embodiments, the present invention also provides a computer-readable storage medium having an audio transmission control program stored thereon, where the audio transmission control program, when executed by a processor, implements the steps of the audio transmission control method according to any one of the above embodiments.

It should be noted that the media embodiment and the method embodiment belong to the same concept, and specific implementation processes thereof are detailed in the method embodiment, and technical features in the method embodiment are correspondingly applicable in the media embodiment, which is not described herein again.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.

While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. An audio transmission control method, characterized in that the method comprises:

creating an aggregate queue for recording the received audio data;

acquiring audio data to be played from the aggregate queue;

2. The audio transmission control method according to claim 1, wherein the creating of the aggregate queue for recording the received audio data includes:

3. The audio transmission control method according to claim 2, wherein the obtaining of the audio data to be played in the aggregate queue includes:

generating a second thread for monitoring the playing state of the audio data;

4. The audio transmission control method according to claim 3, wherein said eliminating the extracted audio data in the aggregate queue when the audio data to be played is extracted comprises:

5. The audio transmission control method according to claim 4, wherein the reducing the receiving amount of the audio data in the aggregate queue to reduce the delay of the audio transmission when the delay of the audio transmission in the current state obtained from the real-time capacity of the aggregate queue exceeds a preset value comprises:

6. The audio transmission control method according to claim 5, wherein when the delay of audio transmission in the current state obtained from the real-time capacity of the aggregate queue exceeds a preset value, the receiving amount of the audio data in the aggregate queue is reduced to slow down the delay of the audio transmission, further comprising:

7. The audio transmission control method according to claim 6, wherein when the delay of audio transmission in the current state obtained from the real-time capacity of the aggregate queue exceeds a preset value, the receiving amount of the audio data in the aggregate queue is reduced to slow down the delay of the audio transmission, further comprising:

presetting a frame skipping condition for processing the received audio data;

8. The audio transmission control method according to claim 7, wherein when the delay of audio transmission in the current state obtained from the real-time capacity of the aggregate queue exceeds a preset value, the receiving amount of the audio data in the aggregate queue is reduced to slow down the delay of the audio transmission, further comprising:

9. An audio transmission control apparatus, characterized in that the apparatus comprises a memory, a processor and a computer program stored on the memory and executable on the processor, which computer program, when executed by the processor, implements the steps of the audio transmission control method according to any one of claims 1 to 8.

10. A computer-readable storage medium, characterized in that an audio transmission control program is stored thereon, which when executed by a processor implements the steps of the audio transmission control method according to any one of claims 1 to 8.