WO2022179246A1 - Audio playback method, device and system - Google Patents

Audio playback method, device and system

Publication number
WO2022179246A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
playback
time point
playback device
expected
Prior art date
Application number
PCT/CN2021/136897
Other languages
French (fr)
Chinese (zh)
Inventor
彭正元
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022179246A1

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/10527 Audio or video recording; Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/10527 Audio or video recording; Data buffering arrangements
    • G11B2020/10537 Audio or video recording
    • G11B2020/10546 Audio or video recording specifically adapted for audio data

Definitions

  • the present application relates to the field of audio technology, and in particular, to an audio playback method, device, and system.
  • Smart devices can be connected to audio playback devices (such as headphones, speakers, etc.) through wireless communication methods such as Bluetooth and Wi-Fi, and play audio content through the audio playback devices.
  • The smart device plays the video picture. Because the crystal oscillators of the smart device and the audio playback device differ, the two devices play at different speeds, so the video picture played by the smart device may fall out of sync with the audio content played by the audio playback device. The drift becomes more obvious after a long period of playback, resulting in a poor user experience.
  • the present application provides an audio playback method, device and system.
  • the technical solution provided by the present application can make the playback progress of the audio playback device and the audio source device (ie, the smart device) consistent, and improve the user experience, especially the listening experience.
  • In a first aspect, an audio playback method is provided, which is applied to a first audio playback device that communicates wirelessly with an audio source device.
  • The method includes: receiving audio data sent by the audio source device; dividing the audio data into N audio segments; buffering the N audio segments, wherein an expected playback time point of each audio segment is obtained according to a first adjustment coefficient; playing each audio segment in turn; periodically collecting the current number of buffered audio segments and the collection time point corresponding to the current number; after the duration of the periodic collection reaches a preset duration, or after the number of collections reaches a preset number of times, obtaining a second adjustment coefficient according to the current number collected each time and the collection time point corresponding to it; and obtaining the expected playback time points of subsequent audio segments according to the second adjustment coefficient, and playing the subsequent audio segments in sequence. Here, N is a positive integer greater than or equal to 2, and the first adjustment coefficient is a preset coefficient.
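  • The segmentation and scheduling step can be sketched as follows. This is a minimal illustration, not the method as claimed: the text does not say how the adjustment coefficient enters the calculation, so the sketch assumes the coefficient scales the nominal segment duration. The names `Segment`, `split_into_segments`, `segment_bytes`, `segment_duration`, and `start_time` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    index: int
    payload: bytes
    expected_play_time: float  # seconds, on the playback device's clock


def split_into_segments(audio: bytes, segment_bytes: int, segment_duration: float,
                        start_time: float, coefficient: float) -> List[Segment]:
    """Cut audio into fixed-size segments and assign expected playback times.

    Assumption (not stated in the source): the adjustment coefficient
    stretches or shrinks the nominal duration of every segment.
    """
    if segment_bytes <= 0:
        raise ValueError("segment_bytes must be positive")
    segments = []
    for i, offset in enumerate(range(0, len(audio), segment_bytes)):
        payload = audio[offset:offset + segment_bytes]
        expected = start_time + i * segment_duration * coefficient
        segments.append(Segment(i, payload, expected))
    return segments
```

  • Under this assumption, a coefficient slightly above 1 spaces the expected playback time points further apart (slower playback), and a coefficient slightly below 1 brings them closer together.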
  • The change trend of the amount of data in the buffer area of the first audio playback device reflects the deviation between the playback speeds of the audio source device and the first audio playback device. Therefore, the time point at which each audio segment in the audio data is expected to be played (referred to as the expected playback time point) can be adjusted according to this trend, so that the playback speeds of the first audio playback device and the audio source device are synchronized. In this way, data overflow or exhaustion in the buffer area of the first audio playback device is avoided, the sound does not stutter or pop during playback, and the listening experience of the played audio is improved.
  • Before receiving the audio data sent by the audio source device, the method further includes: receiving an instruction to play audio sent by the audio source device.
  • Obtaining the second adjustment coefficient according to the current number collected each time and the collection time point corresponding to it includes: performing linear fitting on the current number collected each time and the corresponding collection time point to obtain a first slope; and obtaining the second adjustment coefficient according to the first slope.
  • Discrete points are drawn on a two-dimensional plane.
  • Each discrete point represents the number of first audio segments collected at the corresponding collection time point.
  • A linear regression is performed on the discrete points to obtain a straight line. The slope of the straight line (ie, the first slope) represents the trend of increase or decrease in the buffered first audio segments, that is, the number of first audio segments added or removed per unit time.
  • When the first slope is positive, it indicates the number of first audio segments added per unit time, which means that the playback speed (or delivery speed) of the audio source device is faster than the playback speed of the first audio playback device.
  • When the first slope is negative, it indicates the number of first audio segments removed per unit time, which means that the playback speed (or delivery speed) of the audio source device is slower than the playback speed of the first audio playback device.
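  • The fitting described above can be sketched with an ordinary least-squares line through the (collection time point, buffered segment count) samples. This is only an illustration: the text does not specify the fitting procedure beyond "linear fitting", nor how the second adjustment coefficient is derived from the slope, so the sketch stops at the slope and its interpretation. The names `fit_slope` and `describe_trend` are hypothetical.

```python
from typing import Sequence


def fit_slope(times: Sequence[float], counts: Sequence[float]) -> float:
    """Least-squares slope of buffered-segment count versus collection time."""
    n = len(times)
    if n < 2 or n != len(counts):
        raise ValueError("need at least two (time, count) samples of equal length")
    mean_t = sum(times) / n
    mean_c = sum(counts) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        raise ValueError("collection time points must not all be identical")
    cov = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counts))
    return cov / var_t


def describe_trend(slope: float, tolerance: float = 1e-9) -> str:
    """Interpret the sign of the first slope as described in the text."""
    if slope > tolerance:
        return "source faster than playback device"   # buffer is filling up
    if slope < -tolerance:
        return "source slower than playback device"   # buffer is draining
    return "speeds matched"
```

  • For example, samples taken once per second with counts 10, 12, 14, 16 give a slope of 2 segments per second, which the text interprets as the audio source delivering faster than the device plays.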
  • Periodically collecting the current number of buffered audio segments and the collection time point corresponding to the current number includes: when the actual playback time point and the expected playback time point of any audio segment are two
  • the first audio playback device starts to periodically collect the current number of buffered audio segments and the collection time point corresponding to the current number; wherein the actual playback time point of an audio segment is the expected speaker output time point of that audio segment, which the first audio playback device obtains by calling the interface of its audio output driver.
  • In a second aspect, an audio playback method is provided, which is applied to a first audio playback device that communicates wirelessly with an audio source device.
  • The method includes: receiving audio data sent by the audio source device; dividing the audio data into N audio segments; buffering the N audio segments, wherein an expected playback time point of each audio segment is obtained according to a first adjustment coefficient; playing each audio segment in turn; and, after the absolute value of the difference between the actual playback time point and the expected playback time point of an audio segment is greater than a preset threshold, adjusting the number of buffered audio segments.
  • The actual playback time point is the expected speaker output time point of the audio segment, which the first audio playback device obtains by calling the interface of its audio output driver.
  • N is a positive integer greater than or equal to 2; the first adjustment coefficient is a preset coefficient.
  • In this way, a method for adjusting the playback speed of the first audio playback device is provided. The adjusted playback speed can match the playback speed or delivery speed of the audio source, which helps keep the playback progress consistent with the audio source over a long time.
  • Adjusting the number of buffered audio segments after the absolute value of the difference between the actual playback time point and the expected playback time point of the audio segment is greater than the preset threshold includes: if the absolute value of the difference is greater than the preset threshold and the difference is negative, adding a first number of audio segments; if the absolute value of the difference is greater than the preset threshold and the difference is positive, deleting the first number of audio segments.
  • the first quantity is related to the quotient of the absolute value of the difference divided by the playing duration of the audio segment.
  • the first quantity is the quotient of the absolute value of the difference divided by the playing duration of the audio segment.
  • the added first number of audio segments are mute data.
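  • The buffer adjustment can be sketched as below. The sketch takes the difference as actual minus expected playback time point, uses the integer quotient of the absolute difference and the segment duration as the first number, and pads with zero-filled (mute) segments. Where the silent segments are inserted, and the use of integer milliseconds, are assumptions made for illustration; `adjust_buffer` and its parameters are hypothetical names.

```python
from collections import deque
from typing import Deque


def adjust_buffer(buffer: Deque[bytes], actual_ms: int, expected_ms: int,
                  threshold_ms: int, segment_ms: int, segment_bytes: int) -> int:
    """Return the signed change in buffered segments (+added, -deleted)."""
    diff_ms = actual_ms - expected_ms
    if abs(diff_ms) <= threshold_ms:
        return 0  # within tolerance: leave the buffer untouched
    # The "first number": quotient of |difference| and the segment duration.
    count = abs(diff_ms) // segment_ms
    if diff_ms < 0:
        # Playing ahead of schedule: insert mute segments to fall back.
        silence = bytes(segment_bytes)
        for _ in range(count):
            buffer.appendleft(silence)  # assumption: pad before the next segment
        return count
    # Playing behind schedule: drop segments to catch up.
    removed = min(count, len(buffer))
    for _ in range(removed):
        buffer.popleft()
    return -removed
```

  • Working in integer milliseconds avoids floating-point surprises in the quotient; a real device would more likely count in audio frames or driver clock ticks.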
  • the playback speed of the first audio playback device is adjusted.
  • When the absolute value of the difference between the actual playback time point and the expected playback time point of the audio segment is less than or equal to the preset threshold, the method includes: collecting the actual playback time point of the audio segment and the difference between the actual playback time point and the expected playback time point of the audio segment; after the number of collections reaches a preset number of times, or after the collection duration reaches a preset duration, performing linear fitting on the actual playback time point of each collection and the difference corresponding to it to obtain a second slope; obtaining the current playback speed of the first audio playback device; obtaining an adjusted playback speed according to the current playback speed and the second slope; and playing subsequent audio segments in sequence at the adjusted playback speed.
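  • A minimal sketch of this speed adjustment follows. The text states only that the adjusted speed is obtained from the current speed and the second slope; the formula used here, `current_speed * (1 + slope)`, is an assumption chosen so that a growing lag (difference = actual minus expected, increasing over time) speeds playback up. `fit_slope` and `adjust_playback_speed` are hypothetical names.

```python
from typing import Sequence


def fit_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Least-squares slope of ys against xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two samples of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("sample times must not all be identical")
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x


def adjust_playback_speed(current_speed: float,
                          actual_times: Sequence[float],
                          differences: Sequence[float]) -> float:
    """Scale the playback speed by the drift rate (the second slope).

    Assumption: differences are actual minus expected playback time points,
    in the same unit as actual_times, so the slope is dimensionless drift.
    """
    second_slope = fit_slope(actual_times, differences)
    return current_speed * (1.0 + second_slope)
```

  • For instance, if the lag grows by 1 ms every 10 s, the slope is 0.0001 and a speed of 1.0 becomes 1.0001 under this assumed formula.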
  • the first audio playback device is connected to a second audio playback device, and the method further includes: sending N audio segments to the second audio playback device.
  • The first audio playback device can play audio together with the second audio playback device.
  • In this way, playback synchronization between the second audio playback device and the audio source device is also achieved.
  • the second audio playback device can also adjust the playback speed of the second audio playback device by using the same method for adjusting the playback speed as the first audio playback device.
  • Before the first audio playback device plays the first audio segment, the method further includes: sending an instruction to start playing the audio segment to the second audio playback device.
  • In a third aspect, a first audio playback device is provided. It includes a processor, an audio output device, and a memory; the audio output device and the memory are both coupled to the processor, and the memory is used for storing a computer program. When the computer program is executed by the processor, the first audio playback device is caused to execute the method in the first aspect and any possible implementation manner of the first aspect, or the method in the second aspect and any possible implementation manner of the second aspect.
  • In a fourth aspect, an apparatus is provided.
  • The apparatus is included in a first audio playback device and has the function of implementing the behavior of the first audio playback device in any of the above aspects and their possible implementation manners.
  • This function can be implemented by hardware, or by hardware executing corresponding software.
  • The hardware or software includes at least one module or unit corresponding to the above function, for example, a communication module or unit, a processing module or unit, and a playback module or unit.
  • In a fifth aspect, a computer-readable storage medium is provided. It includes a computer program that, when run on the first audio playback device, causes the first audio playback device to perform the method in the first aspect and any possible implementation manner of the first aspect, or the method in the second aspect and any possible implementation manner of the second aspect.
  • In a sixth aspect, a computer program product is provided. When the computer program product runs on a computer, it causes the computer to execute the method in the first aspect and any possible implementation manner of the first aspect, or the method in the second aspect and any possible implementation manner of the second aspect.
  • In a seventh aspect, a chip system is provided. It includes a processor, and when the processor executes an instruction, the processor executes the method in the first aspect and any possible implementation manner of the first aspect, or the method in the second aspect and any possible implementation manner of the second aspect.
  • In an eighth aspect, a system is provided. It includes an audio source device and a first audio playback device, where the first audio playback device executes the method in the first aspect and any possible implementation manner of the first aspect, or the method in the second aspect and any possible implementation manner of the second aspect.
  • The system further includes a second audio playback device, and the second audio playback device executes the method in the second aspect and any possible implementation manner of the second aspect.
  • FIG. 1 is a schematic diagram of a scene of an audio playback method provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of an audio source device provided by an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of an audio playback device provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of an audio playback method provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a fitting method for the trend in the number of audio segments buffered by an audio playback device according to an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of an audio playback method provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a fitting method for the variation trend of the difference between an actual playback time point and an expected playback time point of an audio playback device provided by an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of an audio playback method provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of an audio playback method provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a chip system provided by an embodiment of the present application.
  • A/B can mean A or B.
  • “And/or” in this document is only an association relationship to describe the associated objects, indicating that three kinds of relationships can exist.
  • A and/or B can mean that A exists alone, A and B exist at the same time, or B exists alone.
  • first and second are only used for descriptive purposes, and should not be construed as indicating or implying relative importance or implicitly indicating the number of indicated technical features.
  • a feature defined as “first” or “second” may expressly or implicitly include one or more of that feature.
  • plural means two or more.
  • Words such as “exemplary” or “for example” are used to represent examples, illustrations, or explanations. Any embodiment or design described in the embodiments of the present application as “exemplary” or “for example” should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of such words is intended to present the related concepts in a specific manner.
  • FIG. 1 is a schematic diagram of a scenario of an audio playback method provided by an embodiment of the present application.
  • FIG. 1 shows a communication system provided by an embodiment of the present application.
  • the communication system includes an audio source device 100 and an audio playback device 200 .
  • the communication system may further include an audio playback device 300 .
  • the audio source device 100 is configured to provide audio content to the audio playback device 200 .
  • The audio source device 100 in this embodiment of the present application may be, for example, a mobile phone, a tablet computer, a personal computer (PC), a personal digital assistant (PDA), a netbook, a wearable device, an augmented reality (AR) device, a virtual reality (VR) device, an in-vehicle device, a smart screen, etc.
  • the specific form of the audio source device 100 is not particularly limited in this application.
  • the audio playback device 200 is configured to receive the audio content sent by the audio source device 100 and play the audio content.
  • the audio playback device 200 may be, for example, a wireless headset, a wireless speaker, a wearable device, an AR device, a VR device, etc.
  • the specific form of the audio playback device 200 is not particularly limited in this application.
  • When the audio source device 100 plays a video, it can play the picture content of the video through its own display screen and send the audio content of the video to the audio playback device 200, which plays the audio content.
  • the audio playback device 200 buffers the audio content received from the audio source device 100 and delays the playback.
  • Because the audio source device 100 and the audio playback device 200 are different devices and have hardware differences (for example, different crystal oscillator frequencies), the playback speeds of the two devices differ. This causes data overflow or exhaustion in the buffer area of the audio playback device 200, so that the sound stutters or pops when audio is played.
  • In one approach, a maximum threshold and a minimum threshold may be set for the data buffered by the audio playback device 200.
  • When the amount of buffered data exceeds the maximum threshold, the playback speed of the audio playback device 200 is increased by a certain ratio.
  • When the amount of buffered data falls below the minimum threshold, the playback speed of the audio playback device 200 is reduced by a certain ratio. The amount of data buffered by the audio playback device 200 is thereby kept within a preset range, which reduces data overflow or exhaustion in the buffer area of the audio playback device 200.
  • However, in this approach the playback speed of the audio playback device 200 is usually adjusted by a fixed ratio, which does not match the actual playback speed deviation, so the adjustment accuracy is low.
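  • For contrast with the fitting-based methods, the fixed-ratio threshold scheme can be sketched as follows. It assumes the speed is raised when the buffered amount exceeds the maximum threshold and lowered when it falls below the minimum threshold; the function name `fixed_ratio_speed` and the default ratio of 1% are hypothetical.

```python
def fixed_ratio_speed(buffered: int, min_threshold: int, max_threshold: int,
                      ratio: float = 0.01, base_speed: float = 1.0) -> float:
    """Pick a playback speed from buffer occupancy using a fixed step.

    The step size is the same no matter how large the real clock drift is,
    which is the accuracy limitation the text points out.
    """
    if min_threshold > max_threshold:
        raise ValueError("min_threshold must not exceed max_threshold")
    if buffered > max_threshold:
        return base_speed * (1.0 + ratio)  # buffer filling: play faster
    if buffered < min_threshold:
        return base_speed * (1.0 - ratio)  # buffer draining: play slower
    return base_speed
```

  • Because the ratio is fixed, a small drift is over-corrected and the speed tends to oscillate between the two steps, whereas a slope estimated from buffer samples can track the actual deviation.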
  • the technical solution provided by the present application can make the playback progress of the audio playback device and the audio source device (ie, the smart device) consistent, and improve the user experience, especially the listening experience.
  • FIG. 2 shows the hardware structure of the audio source device 100 .
  • The audio source device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and so on.
  • The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
  • the structures illustrated in the embodiments of the present invention do not constitute a specific limitation on the audio source device 100 .
  • The audio source device 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the audio source device 100 .
  • The external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function, for example, to save files such as music and video on the external memory card.
  • Internal memory 121 may be used to store computer executable program code, which includes instructions.
  • the internal memory 121 may include a storage program area and a storage data area.
  • the storage program area can store an operating system, an application program required for at least one function (such as a sound playback function, an image playback function, etc.), and the like.
  • the storage data area may store data (such as audio data, phone book, etc.) created during the use of the audio source device 100 and the like.
  • the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (UFS), and the like.
  • the processor 110 executes various functional applications and data processing of the audio source device 100 by executing instructions stored in the internal memory 121, and/or instructions stored in a memory provided in the processor.
  • the processor 110 may include one or more interfaces, such as including a USB interface 130, which is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, and the like.
  • the USB interface 130 can be used to connect a charger to charge the audio source device 100, and can also be used to transmit data between the audio source device 100 and peripheral devices. It can also be used to connect headphones to play audio through the headphones.
  • the interface can also be used to connect other electronic devices, such as AR devices.
  • the wireless communication function of the audio source device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.
  • the mobile communication module 150 may provide a wireless communication solution including 2G/3G/4G/5G etc. applied on the audio source device 100 .
  • The wireless communication module 160 can provide wireless communication solutions applied on the audio source device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signals, and sends the processed signals to the processor 110.
  • The wireless communication module 160 can also receive the signal to be sent from the processor 110, perform frequency modulation on it, amplify it, and convert it into electromagnetic waves for radiation through the antenna 2.
  • The audio source device 100 may establish a communication connection with the audio playback device 200 through the wireless communication module 160 and send the audio content to be played to the audio playback device 200 over the wireless connection; the audio playback device 200 then plays it.
  • the to-be-played audio content may be sound content in the video, or may be independent audio, such as music.
  • The audio playback device 200 may further forward the audio content to the audio playback device 300, and the audio playback device 200 and the audio playback device 300 play the audio content together.
  • the audio playback device 200 is the master playback device
  • the audio playback device 300 is the slave playback device
  • the number of audio playback devices 300 is one or more.
  • the audio source device 100 implements a display function through a GPU, a display screen 194, an application processor, and the like.
  • The audio source device 100 may implement audio functions, such as music playback and recording, through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, an application processor, and the like.
  • FIG. 3 shows the hardware structure of the audio playback device 200 .
  • The audio playback device 200 may include a processor 210, a memory 220, a wireless communication module 230, an antenna 240, a speaker 250, a power module 260, and the like. It can be understood that the structures illustrated in the embodiments of the present invention do not constitute a specific limitation on the audio playback device 200. In other embodiments of the present application, the audio playback device 200 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 210 may include one or more processing units.
  • the processor 210 may include a distribution module, a playback module, a cache module, an adjustment coefficient calculation module, a broadcast speed module, and the like.
  • the processor 210 may further include a clock synchronization module and the like. The specific functions of each module will be described in detail below with reference to specific embodiments. Wherein, different processing units may be independent devices, or may be integrated in one or more processors.
  • Memory 220 may be used to store computer-executable program code, which includes instructions.
  • memory 220 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, universal flash storage (UFS), and the like.
  • the processor 210 executes various functional applications and data processing of the audio playback device 200 by executing the instructions stored in the memory 220 and/or the instructions stored in the memory provided in the processor.
  • the wireless communication function of the audio playback device 200 may be implemented by the antenna 240 , the wireless communication module 230 , the modem processor in the processor 210 , the baseband processor, and the like.
  • the wireless communication module 230 can provide a wireless communication solution including WLAN (eg Wi-Fi network), BT, GNSS, FM, NFC, IR, etc. applied on the audio playback device 200 .
  • the wireless communication module 230 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 230 receives the electromagnetic wave via the antenna 240, modulates and filters the electromagnetic wave signal, and sends the processed signal to the processor 210.
  • the wireless communication module 230 can also receive the signal to be sent from the processor 210 , perform frequency modulation on it, amplify it, and convert it into electromagnetic waves for radiation through the antenna 240 .
  • the audio playback device 200 may establish a communication connection with the audio source device 100 through the wireless communication module 230 , receive audio content sent by the audio source device 100 through the wireless connection, and play through the speaker 250 .
  • the audio playback device 200 may also establish a communication connection with other audio playback devices 300 through the wireless communication module 230 . Then, the audio playback device 200 can forward the audio content to the audio playback device 300 again, and the audio playback device 200 and the audio playback device 300 play together the audio content to play a sound with stereo effect.
  • the audio playback device 200 is a master playback device
  • the audio playback device 300 is a slave playback device
  • the number of audio playback devices 300 is one or more.
  • the power supply module 260 provides power for various components of the audio playback device 200, such as power supply for the processor 210, the memory 220, the wireless communication module 230, and the like.
  • the structure of the audio playback device 300 may refer to the audio playback device 200.
  • the structure of the audio playback device 300 may be the same as or different from that of the audio playback device 200, which is not limited in this application.
  • the technical solutions provided in the embodiments of the present application are applicable to the communication system shown in FIG. 1 , and the audio source device 100 has the structure shown in FIG. 2 , and the audio playback device 200 has the structure shown in FIG. 3 .
  • the audio playback device 200 collects the data volume of the buffer area at multiple adjacent time points (or moments), and calculates the change trend of the buffer area data volume by linear fitting.
  • the change trend of the data volume of the buffer area reflects the deviation between the playback speeds of the audio source device 100 and the audio playback device 200. Therefore, according to this change trend, the time point at which each audio segment in the audio content is expected to be played (referred to as the expected playback time point) can be adjusted, so that the playback speeds of the audio playback device 200 and the audio source device 100 are synchronized, which improves the user's listening experience.
  • the playback speed of the audio playback device 200 can also be adjusted so that it remains consistent with the playback speed of the audio source device 100.
  • the audio playback device 200 may collect the deviation between the expected playback time point of an audio segment and the time point at which the audio playback device 200 actually starts playing that audio segment (referred to as the actual playback time point of the audio segment), then calculate the variation trend of the deviation by linear fitting, and use the variation trend to adjust the playback speed of the audio playback device 200 so that it is consistent with the playback speed of the audio source device 100.
  • when the audio source device 100 plays a video, it can play the screen content of the video through its own display screen and send the audio content of the video to the audio playback device 200 and the audio playback device 300, and the audio playback device 200 and the audio playback device 300 jointly play the audio content.
  • the audio playback device 200 is the master playback device
  • the audio playback device 300 is the slave playback device.
  • the audio playback device 200 and the audio playback device 300 are devices of the same manufacturer, and can perform clock synchronization. Even if clock synchronization is performed, the audio playback device 200 and the audio playback device 300 are still different devices, and there are still differences in playback speed due to hardware differences (eg, different crystal oscillator frequencies). When the playback time is prolonged, the playback progress of the two devices is still out of sync.
  • the audio playback device 300 can also collect the deviation between the expected playback time point of an audio segment and the time point at which the audio playback device 300 actually starts to play that audio segment (referred to as the actual playback time point), then calculate the variation trend of the deviation by linear fitting, and use the variation trend to adjust the playback speed of the audio playback device 300.
  • the audio source device 100 plays audio (eg, music)
  • the audio may be directly sent to the audio playback device 200 , and the audio playback device 200 plays the audio content.
  • the speed at which the audio source device 100 sends the audio stream to the audio playback device 200 (also referred to as the delivery speed) may differ from the playback speed of the audio playback device 200, which may still cause data overflow or exhaustion in the buffer area of the audio playback device 200.
  • the audio playback device 200 needs to adjust its playback speed to be consistent with the delivery speed of the audio source device 100, so as to avoid data overflow or exhaustion in the buffer area of the audio playback device 200, thereby avoiding sound stuttering or popping when playing audio.
  • the audio source device 100 plays audio (for example, music)
  • the audio can be directly sent to the audio playback device 200 and the audio playback device 300, and the audio playback device 200 and the audio playback device 300 jointly play the audio content.
  • the audio playback device 200 needs to adjust the playback speed to be consistent with the delivery speed of the audio source device 100
  • the audio playback device 300 also needs to adjust the playback speed to be consistent with the delivery speed of the audio source device 100 .
  • FIG. 4 shows the flow of an audio playback method provided by the present application.
  • the audio playback method may include:
  • the audio source device 100 receives an instruction to play audio.
  • the audio source device 100 is connected with at least one audio playback device, for example, the audio playback device 200 .
  • the audio source device 100 may establish a wireless connection with the audio playback device 200 through the wireless communication module 160, and the adopted wireless connection mode may be, for example, Bluetooth, WLAN, NFC, or the like.
  • the audio source device 100 can play the screen content of the video, and send the audio content (that is, the audio stream) of the video to the audio playback device 200 for playback by the audio playback device 200 , that is, to realize the external playback of audio.
  • the user operates on the audio source device 100 and instructs to start playing pure audio (eg, music, recording, etc.), and the audio source device 100 sends the audio stream to the audio playback device 200 .
  • the screen content on the audio source device 100 side should be consistent with the playback speed of the audio content on the audio playback device 200 side.
  • the speed at which the audio source device 100 delivers the audio stream to the audio playback device 200 should be consistent with the playback speed of the audio content on the audio playback device 200 side.
  • the audio source device 100 sends an audio stream to the audio playback device 200 .
  • the audio playback device 200 encodes and decodes the received audio stream.
  • the distribution module receives the audio stream sent by the audio source device 100, and encodes and decodes the received audio stream to obtain audio data that conforms to its own playback format.
  • the encoding and decoding are related to the number of channels, the number of sampling bits, and the sampling frequency of the audio playback device 200 .
  • the audio playback device 200 segments the encoded and decoded audio stream, and calculates the expected playback time point of each audio segment according to the adjustment coefficient.
  • step S404 may specifically include steps S404a to S404e.
  • the adjustment coefficient calculation module returns the current adjustment coefficient to the distribution module.
  • the adjustment coefficient calculation module can periodically update the value of the adjustment coefficient, and the distribution module calculates the expected playback time point of each audio segment according to the current latest adjustment coefficient. For the calculation process of updating the adjustment coefficient by the adjustment coefficient calculation module, reference may be made to the following step S406.
  • the initial value of the adjustment coefficient can be set to 1.
  • the distribution module segments the encoded and decoded audio stream, and calculates the expected playback time point of each audio segment.
  • each audio fragment after fragmentation may be:
  • index is the number of the audio segment, which starts from 1 and increases sequentially.
  • len is the data length of the audio fragment.
  • the playback duration of the audio segment is a preset value, for example, 10 ms (milliseconds). In other words, the data length of the audio segment is a fixed value.
  • the playtime is the expected playback time point of the audio segment, for example, in μs (microseconds).
  • the expected playback time point of the first audio segment (that is, audio segment 1#) = current time + preset delay time (for example, 1 s).
  • the preset delay time may enable the audio playback device 200 to cache the data of the audio segment, so as to prevent abnormal playback caused by network transmission jitter.
  • Expected playback time point of the Nth audio segment = expected playback time point of the first audio segment + (N-1) * audio segment playback duration * 1000 * adjustment coefficient.
  • the expected playback time point of the audio segment in this application is determined according to the expected playback time point of the first audio segment, the number of the audio segment, and the adjustment coefficient.
  • the adjustment coefficient is dynamically changed according to the number of audio segments buffered in the audio playback device 200 .
  • the following step S406 will describe the calculation method of the adjustment coefficient in detail, and will not be described here for the time being.
  • the initial value of the adjustment coefficient is 1.
  • the current time is 2020/11/11 00:00:00.000 000.
  • the preset delay time is 1s, then the expected playback time point of the first audio segment is 2020/11/11 00:00:01.000 000.
  • the data information of the first audio segment is shown in Table 1:
  • Table 1: index = 1; len = 3840; playTime = 2020/11/11 00:00:01.000 000
  • the data information of the second audio segment is shown in Table 2:
  • Table 2: index = 2; len = 3840; playTime = 2020/11/11 00:00:01.010 000
  • the audio segment playback duration is expressed in ms (milliseconds), whereas the expected playback time point of the audio segment is expressed in μs (microseconds); the factor of 1000 in the formula converts milliseconds to microseconds.
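  • The calculation of the expected playback time points in Table 1 and Table 2 can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function names `first_play_time` and `expected_play_time` and the constants are hypothetical, and the 10 ms segment duration and 1 s preset delay are the example values given above.

```python
from datetime import datetime, timedelta

SEGMENT_DURATION_MS = 10   # preset playback duration of one audio segment (example value)
PRESET_DELAY_S = 1         # preset delay before the first segment plays (example value)


def first_play_time(now: datetime) -> datetime:
    """Expected playback time point of audio segment 1#."""
    return now + timedelta(seconds=PRESET_DELAY_S)


def expected_play_time(first: datetime, n: int, coefficient: float = 1.0) -> datetime:
    """Expected playback time point of the N-th segment (N starts at 1)."""
    # Duration is in milliseconds; the factor 1000 converts it to microseconds.
    offset_us = (n - 1) * SEGMENT_DURATION_MS * 1000 * coefficient
    return first + timedelta(microseconds=offset_us)


now = datetime(2020, 11, 11, 0, 0, 0)
first = first_play_time(now)
second = expected_play_time(first, 2)   # adjustment coefficient at its initial value 1
print(first.isoformat(sep=" ", timespec="microseconds"))   # 2020-11-11 00:00:01.000000
print(second.isoformat(sep=" ", timespec="microseconds"))  # 2020-11-11 00:00:01.010000
```

  • With the adjustment coefficient at its initial value of 1, the second segment lands exactly 10 ms after the first, matching Table 2.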
  • the distribution module may periodically request the current adjustment coefficient from the adjustment coefficient calculation module, so as to calculate the expected playback time point of the audio segment according to the current adjustment coefficient.
  • the distribution module may also request the current adjustment coefficient from the adjustment coefficient calculation module after receiving the audio stream data of a specific amount of data, so as to calculate the expected playback time point of the audio segment according to the current adjustment coefficient.
  • the updated adjustment coefficient may also be sent to the distribution module, so that the distribution module calculates the expected playback time point of the audio segment according to the updated adjustment coefficient.
  • the distribution module can passively receive the latest adjustment coefficient sent by the adjustment coefficient calculation module to calculate the expected playback time point of the audio segment.
  • the distribution module sends each audio segment to the cache module.
  • the distribution module sends the generated audio segments to the cache module in turn for caching.
  • Each audio segment carries its expected playback time point.
  • the cache module caches each audio fragment.
  • the audio playback device 200 starts audio playback.
  • the audio playback device 200 includes a distribution module, an adjustment coefficient calculation module, a playback module, and a cache module as an example for description.
  • step S405 may specifically include step S405a and step S405b.
  • the distribution module when the current time is later than or equal to the expected playback time point of the first audio segment, the distribution module notifies the playback module to start audio playback.
  • the playback module reads the data of the first audio segment and the data of the subsequent audio segments from the cache module, and starts to play each audio segment in sequence.
  • the playback module reads the data of the audio segment from the cache module
  • the data of the audio segment is written into an audio output driver in the playback module, such as an advanced Linux sound architecture (ALSA).
  • the expected output time corresponding to the currently written audio segment is obtained through the interface of the audio output driver, and the expected output time may be considered as the actual playback time point of the audio segment.
  • the playback module can notify the cache module to delete some audio segments, so that the playback progress of the audio playback device 200 matches the playback progress of the audio source device 100 as soon as possible.
  • the number of audio segments deleted by the audio playback device 200 may be determined according to the quotient of the absolute value of the difference divided by the playback duration of an audio segment (i.e., the data length of each audio segment).
  • the number is equal to the quotient of the absolute value of the difference divided by the playback duration of the audio segment. If the quotient is not an integer, it may be rounded, and the resulting integer is used as the number of deleted audio segments. The rounding method may be rounding to the nearest integer, rounding up, rounding down, or the like.
  • the playback module can notify the cache module to add some audio segments.
  • the added audio segment may be muted audio data, or may be copying the currently written audio segment data or other audio data. This can be equivalent to the audio playback device 200 playing subsequent audio segments after waiting for a corresponding time, so that the audio playback device 200 and the audio source device 100 have the same playback progress.
  • the data of the added audio segment is all 0, that is, the mute data.
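  • The number of audio segments to delete or add, and the all-zero mute segment, can be sketched as follows. This is only an illustration under stated assumptions: the names `segments_to_adjust` and `silent_segment` are hypothetical, the difference is assumed to be measured in microseconds, and the segment length of 3840 is taken from the len value in the example tables.

```python
import math


def segments_to_adjust(diff_us: float, segment_duration_us: int = 10_000,
                       rounding=round) -> int:
    """Number of segments to delete or add for a given playback time difference.

    `rounding` may be round, math.ceil or math.floor, matching the rounding
    options described in the text.
    """
    return int(rounding(abs(diff_us) / segment_duration_us))


def silent_segment(length: int = 3840) -> bytes:
    """An added segment whose data is all 0, that is, mute data."""
    return bytes(length)


print(segments_to_adjust(65_000, rounding=math.ceil))   # 7
print(segments_to_adjust(65_000, rounding=math.floor))  # 6
```

  • Note that Python's built-in `round` rounds exact halves to the nearest even integer, so `math.ceil` or `math.floor` is the safer choice when a specific rounding direction is intended.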
  • the expected playtime (playtime) carried by each audio segment here is determined according to the number of the audio segment, the expected playback time of the first audio segment, and the adjustment coefficient.
  • the adjustment coefficient is dynamically changed according to the number of audio segments buffered in the audio playback device 200 . The calculation and update process of the adjustment coefficient will be described in detail below.
  • after the audio playback device 200 determines that the current time is the expected playback time point of the first audio fragment and starts audio playback, it also records how the number of audio fragments buffered in the audio playback device 200 changes over time, and calculates the adjustment coefficient according to this change characteristic. That is to say, when the audio playback device 200 performs step S405, step S406 is also performed, as follows:
  • the audio playback device 200 records the change characteristic of the number of buffered audio fragments over time, and calculates an adjustment coefficient according to the change characteristic.
  • the audio playback device 200 includes a distribution module, an adjustment coefficient calculation module, a playback module, and a cache module as an example for description.
  • step S406 may specifically include steps S406a to S406d.
  • After receiving the notification of starting audio playback, the playback module notifies the adjustment coefficient calculation module to periodically collect the number of audio fragments in the cache module.
  • the playback module may immediately notify the adjustment coefficient calculation module to periodically collect the number of audio fragments in the cache module after receiving the notification for starting audio playback, or after receiving the notification for starting audio playback After a period of time (for example, 1 second), the adjustment coefficient calculation module is notified to start the collection.
  • the playback module may alternatively notify the adjustment coefficient calculation module to start collecting when it detects that the absolute value of the difference between the expected playback time point and the actual playback time point of an audio segment is greater than the preset threshold A (or another threshold). That is to say, the present application does not specifically limit the timing at which the adjustment coefficient calculation module starts to periodically collect the number of audio fragments in the cache module.
  • the adjustment coefficient calculation module periodically collects the number of audio fragments in the buffer module.
  • the adjustment coefficient calculation module may set a timer, and then periodically collect from the cache module the number of audio fragments currently stored in it, for example, once every 200 μs.
  • the adjustment coefficient calculation module may also instruct the cache module to periodically report the number of audio fragments it stores. That is, the adjustment coefficient calculation module sends the cache module an instruction to periodically report the number of buffered audio fragments. After receiving the instruction, the cache module sets a timer and periodically reports the number of audio fragments it stores.
  • the adjustment coefficient calculation module stores the collected number of audio fragments in the cache module together with the collection time point.
  • the adjustment coefficient calculation module records the collection time point and the number of audio fragments in the buffer module collected at each collection time point in the sampling queue 1, and the content stored in the sampling queue 1 is shown in Table 3.
  • the adjustment coefficient calculation module calculates and updates the adjustment coefficient according to the collected number of audio fragments in the cache module and the collection time points.
  • the preset condition may be that the number of data entries in the sampling queue 1 (for example, each row of data in Table 3 is one entry) reaches a predetermined number, such as 100, or that a preset time period (e.g., 3 minutes) has elapsed since the adjustment coefficient was last calculated and updated.
  • the adjustment coefficient calculation module can perform linear fitting on the data in the sampling queue 1 to obtain the change trend of the audio fragments in the cache module, that is, the relationship between the number of audio fragments and time.
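  • One possible shape for the sampling queue 1 and its preset condition is sketched below. The class name `SamplingQueue` and its methods are hypothetical, times are assumed to be in seconds, and the limits of 100 entries and 3 minutes are the example values from the text. Requiring at least two entries before a time-based trigger is an added assumption so that a line can actually be fitted.

```python
from collections import deque

MAX_SAMPLES = 100    # predetermined number of entries (example value)
MAX_AGE_S = 180.0    # preset period since the last update, 3 minutes (example value)


class SamplingQueue:
    """Holds (collection time point, number of buffered fragments) entries."""

    def __init__(self, now: float):
        self.samples = deque()
        self.last_update = now

    def record(self, t: float, count: int) -> None:
        self.samples.append((t, count))

    def ready(self, now: float) -> bool:
        """True when the preset condition for recalculating the coefficient holds."""
        n = len(self.samples)
        return n >= MAX_SAMPLES or (n >= 2 and now - self.last_update >= MAX_AGE_S)

    def drain(self, now: float) -> list:
        """Return the collected entries and clear the queue for the next cycle."""
        data = list(self.samples)
        self.samples.clear()
        self.last_update = now
        return data
```

  • A caller would record one entry per timer tick, check `ready`, and pass the result of `drain` to the linear fitting step.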
  • the audio source device 100 sends an audio stream to the audio playback device 200 according to its own playback progress.
  • the cache module of the audio playback device 200 buffers the received audio stream to obtain a buffer queue.
  • the head of the buffer queue is the audio fragment obtained from the audio stream received first
  • the tail of the buffer queue is the audio fragment obtained from the audio stream received later.
  • the playback speed of the audio source device 100 (or the speed of delivering the audio stream to the audio playback device 200, referred to as the delivery speed for short) affects the increasing speed of the audio fragments at the end of the cache queue.
  • After the audio playback device 200 starts audio playback, it acquires audio segments from the head of the cache queue for playback, and deletes the played audio segments.
  • the playback speed of the audio playback device 200 affects the reduction speed of the audio fragment at the head of the buffer queue.
  • the difference between the playback speed (or delivery speed) of the audio source device 100 and the playback speed of the audio playback device 200 is reflected in the changing trend of the number of audio fragments in the buffer queue. Since the network transmission between the audio source device 100 and the audio playback device 200 also affects the number of audio fragments at individual time points in the buffer queue, individual abnormal data can be excluded by linear fitting.
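  • The relationship between the two speeds and the length of the buffer queue can be illustrated with a toy simulation. It is a simplified sketch under stated assumptions: the function name `simulate` and the per-tick delivery and playback counts are hypothetical, and network jitter is ignored.

```python
from collections import deque


def simulate(deliver_per_tick: int, play_per_tick: int, ticks: int,
             initial: int = 100) -> list:
    """Return the buffer queue length after each tick."""
    queue = deque(range(initial))      # fragment indexes already buffered
    next_index = initial
    sizes = []
    for _ in range(ticks):
        # The source device appends newly delivered fragments at the tail.
        for _ in range(deliver_per_tick):
            queue.append(next_index)
            next_index += 1
        # The playback device consumes fragments from the head.
        for _ in range(min(play_per_tick, len(queue))):
            queue.popleft()
        sizes.append(len(queue))
    return sizes


print(simulate(101, 100, 5))  # [101, 102, 103, 104, 105]: source is faster, queue grows
print(simulate(100, 101, 5))  # [99, 98, 97, 96, 95]: playback is faster, queue shrinks
```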
  • each discrete point in FIG. 5 is used to represent the number of audio segments collected at the corresponding collection time point.
  • the straight line in Figure 5 is obtained after linear regression is performed on the discrete points.
  • the slope of the straight line (denoted as slope 1) represents the change trend of the number of audio fragments in the cache module, that is, the number of audio fragments added or removed per unit time.
  • When the slope 1 is a positive value, it indicates the number of audio segments added per unit time, which also means that the playback speed (or delivery speed) of the audio source device 100 is faster than the playback speed of the audio playback device 200. When the slope 1 is a negative value, it indicates the number of audio segments reduced per unit time, which also means that the playback speed (or delivery speed) of the audio source device 100 is slower than the playback speed of the audio playback device 200.
  • Adjustment coefficient = 1 - (slope 1 * value of audio segment playback duration / 1000)
  • the value of the playback duration of the audio segment is the value when the unit of the playback duration of the audio segment is milliseconds; the value has no unit. Adjustment factor and slope 1 have no unit.
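  • The linear fitting and the adjustment coefficient formula can be sketched as follows. This is a minimal illustration using an ordinary least-squares fit; the names `fit_slope` and `adjustment_coefficient` are hypothetical, and the collection time points are assumed to be expressed in seconds, which is what the division by 1000 in the formula suggests. The sample data is made up for the example.

```python
def fit_slope(points):
    """Least-squares slope of y over x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def adjustment_coefficient(slope_1: float, segment_duration_ms: float = 10) -> float:
    """Adjustment coefficient = 1 - (slope 1 * segment duration in ms / 1000)."""
    return 1 - slope_1 * segment_duration_ms / 1000


# Illustrative samples: the buffer gains 0.5 fragments per second.
samples = [(t, 50 + 0.5 * t) for t in range(0, 100, 10)]
slope_1 = fit_slope(samples)
coefficient = adjustment_coefficient(slope_1)
print(slope_1, round(coefficient, 6))
```

  • In this example the buffer grows, so the coefficient comes out slightly below 1 and later segments are scheduled slightly earlier.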
  • After the adjustment coefficient calculation module obtains the latest adjustment coefficient, the data in the sampling queue 1 is cleared. Subsequently, the adjustment coefficient for the next cycle is calculated according to the number of audio fragments in the cache module collected in the next cycle.
  • the expected playback time point of the Nth audio segment = the expected playback time point of the first audio segment + (N-1) * audio segment playback duration * 1000 * adjustment coefficient    Formula (2)
  • When the slope 1 is a positive value, the playback speed (or delivery speed) of the audio source device 100 is faster than the playback speed of the audio playback device 200, so the playback progress of the audio source device 100 is also ahead of the playback progress of the audio playback device 200. In this case the adjustment coefficient is less than 1, and the expected playback time point of the Nth audio segment is smaller than it would be with an adjustment coefficient of 1; that is, the expected playback time point of the Nth audio segment is advanced. In other words, the playback progress of the audio playback device 200 is accelerated, which helps it catch up with the playback progress of the audio source device 100 as soon as possible.
  • When the slope 1 is a negative value, the playback speed (or delivery speed) of the audio source device 100 is slower than the playback speed of the audio playback device 200, so the playback progress of the audio source device 100 is also behind that of the audio playback device 200. In this case the adjustment coefficient is greater than 1, and the expected playback time point of the Nth audio segment is larger than it would be with an adjustment coefficient of 1; that is, the expected playback time point of the Nth audio segment is delayed. The playback progress of the audio playback device 200 is thus slowed down so that it matches the playback progress of the audio source device 100.
  • The adjustment coefficient is updated according to the change trend of the number of audio fragments cached in the audio playback device 200, and the expected playback time point of each audio fragment is then calculated according to the adjustment coefficient, which keeps the playback speed of the audio playback device 200 consistent with the playback speed (or delivery speed) of the audio source device 100. In this way, data overflow or exhaustion in the buffer area of the audio playback device 200 can be avoided, thereby avoiding sound stuttering or popping when playing the audio and improving the listening experience of the externally played audio.
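  • The effect of the adjustment coefficient on the schedule can be checked with a short calculation. The function name `offset_us` and the coefficient values 0.995 and 1.005 are illustrative assumptions; the 10 ms segment duration is the example value used earlier.

```python
def offset_us(n: int, coefficient: float, segment_duration_ms: float = 10) -> float:
    """Offset of the N-th segment from the first segment, in microseconds."""
    return (n - 1) * segment_duration_ms * 1000 * coefficient


baseline = offset_us(1001, 1.0)         # 10 s after the first segment
faster_source = offset_us(1001, 0.995)  # coefficient < 1: scheduled earlier
slower_source = offset_us(1001, 1.005)  # coefficient > 1: scheduled later

print(baseline - faster_source)   # roughly 50 000 us (50 ms) earlier
print(slower_source - baseline)   # roughly 50 000 us (50 ms) later
```

  • A deviation of 0.5 % in the coefficient therefore shifts the 1001st segment by about 50 ms, and the shift keeps growing with N.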
  • the audio playback device 200 and the audio source device 100 may have inconsistent playback progress due to hardware differences (eg, different crystal oscillator frequencies) between the two devices.
  • the audio playback device 200 can also adjust its own playback speed so that it is consistent with the playback speed of the audio source device 100, which allows the audio playback device 200 to keep its playback progress consistent with that of the audio source device 100 over a long time.
  • the audio playback device 200 includes a distribution module, a playback module, a cache module, and a playback speed module as an example for description.
  • FIG. 6 shows the flow of another audio playback method provided by the present application.
  • the audio playback method includes steps S401-S403, S404c-S404e, S405a, S601-S611, and S406.
  • For steps S401 to S403, steps S404c to S404e, step S405a, and step S406, refer to the description of the relevant content in FIG. 4; details are not repeated here.
  • the playback module obtains the content of the audio segment from the cache module.
  • When the current time is equal to the expected playback time point of the first audio segment, the distribution module notifies the playback module to start audio playback, and the playback module starts to sequentially read the content of the first audio segment and subsequent audio segments from the cache module.
  • the playback module writes the acquired audio segment content to the audio output driver.
  • the playback module sequentially writes the read audio segments into the audio output driver (eg ALSA) in the playback module, and plays the currently written audio segment through the audio output driver.
  • the playback module invokes the interface of the audio output driver to query the speaker expected output time of the currently written audio segment, that is, the actual playback time point of the currently written audio segment.
  • After the playback module writes the audio segment to the audio output driver, it also calls the interface of the audio output driver to query the speaker expected output time of the currently written audio segment. The speaker expected output time of the audio segment can be considered to be the actual playback time point of the audio segment.
  • the playback module calculates the difference between the actual playback time point of the audio fragment and the expected playback time point (eg, playtime) carried in the audio fragment, and determines whether the absolute value of the difference is greater than the preset threshold A.
  • the absolute value of the difference can then be compared with the preset threshold A.
  • the absolute value of the difference is greater than the preset threshold A, indicating that the playback progress difference between the audio playback device 200 and the audio source device 100 is relatively large.
  • Step S605 can be executed to quickly reduce the playback progress difference between the two devices.
  • the absolute value of the difference is less than or equal to the preset threshold A, indicating that the difference between the playback progress of the audio playback device 200 and the audio source device 100 is small, and steps S606 to S611 may be performed so that the playback speed of the audio playback device 200 is consistent with the playback speed of the audio source device 100.
  • After the playback module calculates the difference between the actual playback time point of the audio segment and the expected playback time point (such as playtime) carried in the audio segment, it may also skip comparing the absolute value of the difference with the preset threshold A and directly execute step S606.
  • the playback module notifies the cache module to delete or add audio segments.
  • When the absolute value of the difference between the actual playback time point of the audio segment and the expected playback time point is greater than the preset threshold A, whether to delete one or more audio segments or to add one or more audio segments is further determined according to the relative size of the actual playback time point and the expected playback time point.
  • the playback module may notify the cache module to delete some audio fragments, so that the playback progress of the audio playback device 200 matches the playback progress of the audio source device 100 as soon as possible.
  • the playback module may notify the cache module to add some audio fragments.
  • the added audio segment may be muted audio data, or a copy of the currently written audio segment data or other audio data. This is equivalent to the audio playback device 200 playing the subsequent audio segments after waiting for a corresponding time, so that the audio playback device 200 and the audio source device 100 have the same playback progress.
  • the playback module monitors the difference between the actual playback time point of the subsequent audio segment and the expected playback time point, determines whether the difference is greater than the preset threshold A, and further adopts a corresponding method.
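  • The branching between the coarse correction (step S605) and the fine speed adjustment (steps S606 to S611) can be sketched as follows. The enum `Action`, the function `classify`, and the threshold value are hypothetical. The direction of the correction is an assumption inferred from the text: if the actual playback time point is later than expected, playback lags and segments are deleted; if it is earlier, segments are added.

```python
from enum import Enum


class Action(Enum):
    DELETE_SEGMENTS = "delete"   # step S605, playback lags behind
    ADD_SEGMENTS = "add"         # step S605, playback runs ahead
    RECORD_SAMPLE = "record"     # steps S606 to S611, fine speed adjustment


def classify(actual_us: int, expected_us: int, threshold_a_us: int) -> Action:
    """Decide how to react to one segment's playback time difference."""
    diff = actual_us - expected_us
    if abs(diff) <= threshold_a_us:
        return Action.RECORD_SAMPLE
    # Assumed direction: actual later than expected means playback is behind.
    return Action.DELETE_SEGMENTS if diff > 0 else Action.ADD_SEGMENTS


THRESHOLD_A_US = 50_000  # hypothetical value for preset threshold A
print(classify(1_060_000, 1_000_000, THRESHOLD_A_US))  # Action.DELETE_SEGMENTS
print(classify(1_010_000, 1_000_000, THRESHOLD_A_US))  # Action.RECORD_SAMPLE
```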
  • the playback module sends the actual playback time point of the audio segment and the corresponding difference to the playback speed module.
  • When the playback speed of the audio playback device 200 needs to be adjusted, the change trend of the difference over the actual playback time points of the audio playback device 200 must be calculated. Therefore, when the playback module determines that the absolute value of the difference does not exceed the preset threshold A, it sends the difference and the corresponding actual playback time point to the playback speed module.
  • the playback speed module stores the actual playback time point of the audio segment and the corresponding difference.
  • the playback speed module records the actual playback time point of each received audio segment and the corresponding difference in the sampling queue 2, and the content stored in the sampling queue 2 is shown in Table 4.
  • the playback speed module calculates the deviation of the playback speed according to the actual playback time points of the audio segments and the corresponding differences.
  • When the playback speed module determines that a certain condition is met, it may calculate the deviation between the playback speeds of the audio playback device 200 and the audio source device 100 according to the data in the sampling queue 2.
  • the certain condition may be, for example, that the number of data pieces in the sampling queue 2 (for example, each row of data in Table 4 is one piece of data) reaches a predetermined number, for example, 100 pieces of data.
  • the adjustment coefficient calculation module may perform linear fitting on the data in the sampling queue 2 to obtain a variation trend of the deviation between the playback speeds of the audio playback device 200 and the audio source device 100.
  • the discrete points are drawn on the two-dimensional plane. After performing linear regression on the discrete points, a fitted straight line is obtained.
  • the slope of the straight line (referred to as slope 2) represents the deviation between the actual playback speed and the expected playback speed, that is, the amount of time deviation accumulated per unit time.
  • if slope 2 of the fitted straight line is a positive value, the actual playback speed of the audio playback device 200 is slower than the expected playback speed, and the playback speed of the audio playback device 200 needs to be increased.
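The fit over sampling queue 2 can be illustrated with an ordinary least-squares slope. The sketch below is an assumption-laden example: the source only says "linear fitting", so the choice of least squares, the function name `fit_slope`, the 20 ms segment spacing, and the synthetic drift of 1 ms per second are all hypothetical.

```python
from collections import deque

QUEUE_SIZE = 100  # predetermined number of data pieces given in the text


def fit_slope(samples):
    """Least-squares slope of difference versus actual playback time."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_d = sum(d for _, d in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("need at least two distinct time points")
    cov = sum((t - mean_t) * (d - mean_d) for t, d in samples)
    return cov / var_t


# Hypothetical stand-in for sampling queue 2: (actual_time_s, difference_s).
sampling_queue_2 = deque(maxlen=QUEUE_SIZE)
for i in range(QUEUE_SIZE):
    t = i * 0.02                             # one 20 ms segment per sample (assumed)
    sampling_queue_2.append((t, 0.001 * t))  # synthetic drift: 1 ms late per second

slope_2 = fit_slope(sampling_queue_2)
print(f"slope 2 = {slope_2:.6f}")
```

A positive result here corresponds to the case in the text where the device plays slower than expected.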
  • the playback speed module calculates the target playback speed according to the deviation of the playback speed.
  • the target playback speed is the desired playback speed of the audio playback device 200.
  • formula (3) can be used to calculate the target playback speed:
  • target playback speed = current playback speed * (1 + slope 2)    formula (3)
  • a preset threshold B can also be set.
  • if the absolute value of slope 2 is less than the preset threshold B, the difference between the actual playback time point of the audio playback device 200 and the expected playback time point can be considered small, and the playback speed of the audio playback device 200 does not need to be adjusted.
  • otherwise, formula (3) is used to adjust the playback speed of the audio playback device 200.
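Formula (3) together with the threshold-B check can be written as a small function. This is a sketch under stated assumptions: the function name, treating the playback speed as a unitless factor, and the example threshold of 0.0001 are illustrative choices that do not come from the source.

```python
def target_playback_speed(current_speed, slope_2, threshold_b):
    """Apply formula (3), leaving the speed unchanged when |slope 2| < threshold B."""
    if abs(slope_2) < threshold_b:
        return current_speed              # deviation too small to act on
    return current_speed * (1 + slope_2)  # formula (3)


print(target_playback_speed(1.0, 0.001, 0.0001))    # device is slow: speed up
print(target_playback_speed(1.0, -0.002, 0.0001))   # device is fast: slow down
print(target_playback_speed(1.0, 0.00005, 0.0001))  # below threshold B: unchanged
```

The threshold acts as a dead band, so measurement noise in slope 2 does not cause the driver speed to be rewritten on every fit.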
  • the playback speed module sends the target playback speed to the playback module.
  • the playback module modifies the playback speed value of the audio output driver to the target playback speed.
  • the playback module adjusts the playback speed of the audio output driver to (1+slope 2) times the current speed.
  • the playback module monitors the difference between the actual playback time point of the subsequent audio segment and the expected playback time point, determines whether the difference is greater than the preset threshold A, and further adopts a corresponding method.
  • the playback speed of the audio playback device 200 is adjusted to be consistent with the playback speed (or delivery speed) of the audio source device 100.
  • the speeds of the devices are different, causing the audio playback device 200 to frequently add or delete audio segments in the cache.
  • the technical solution described in FIG. 4 and the technical solution described in FIG. 6 may be combined. That is to say, the adjustment coefficient is first calculated from the change trend of the number of audio fragments buffered in the audio playback device 200, and the expected playback time point of each audio fragment is calculated according to the adjustment coefficient, which aligns the time points at which the audio playback device 200 and the audio source device 100 expect each audio segment to start playing. Then, the audio playback device 200 can further adjust its own playback speed so that it is consistent with the playback speed of the audio source device 100, allowing the audio playback device 200 to keep its playback progress consistent with that of the audio source device 100 for a long time.
  • the audio playback device 200 is further connected with other audio playback devices, for example, the audio playback device 300 .
  • the audio content is played jointly by the audio playback device 200 and the audio playback device 300 .
  • the playback speed of the audio playback device 300 should also be consistent with the playback speed (or delivery speed) of the audio source device 100.
  • FIG. 8 shows the flow of another audio playback method provided by the present application.
  • the audio playback method includes steps S401 to S406, and steps S801 to S805.
  • for steps S401 to S406, refer to the related content of the process in FIG. 4.
  • the differences from the flow in Figure 4 are highlighted here.
  • a wired connection or a wireless connection may be established between the audio playback device 200 and the audio playback device 300, where the wireless connection may be, for example, Bluetooth, WLAN, NFC, or the like.
  • the audio playback device 200 is time synchronized with the audio playback device 300.
  • audio playback device 200 and audio playback device 300 establish a wireless connection. Before the audio playback device 200 and the audio playback device 300 play audio content together, the audio playback device 200 and the audio playback device 300 perform time synchronization. For example, the two devices perform time synchronization after the audio playback device 200 receives the audio stream (ie, step S402), or after the audio playback device 200 sends the audio segment to the audio playback device 300 (ie, step S802), or after the audio playback device 200 notifies the audio playback device 300 to start audio playback (ie, step S804).
  • the audio playback device 200 and the audio playback device 300 perform time synchronization.
  • the audio playback device 200 may use a simple network time protocol (SNTP) or a precision time protocol (PTP), etc., to communicate with the audio playback device 300 (eg, specifically a time synchronization module) to perform time synchronization.
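As background on what such a synchronization exchange yields, the sketch below shows the standard four-timestamp offset and delay estimate used by NTP and SNTP. The function name and the example timestamps are hypothetical, and which device acts as the requester is an assumption; the source does not describe the exchange in this detail.

```python
def clock_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP/SNTP four-timestamp estimate.

    t1: request sent (local clock)    t2: request received (remote clock)
    t3: reply sent (remote clock)     t4: reply received (local clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2  # remote clock minus local clock
    delay = (t4 - t1) - (t3 - t2)         # round-trip network delay
    return offset, delay


# Example in milliseconds: the remote clock is 5000 ms ahead, each network
# leg takes 10 ms, and the remote side spends 2 ms handling the request.
offset_ms, delay_ms = clock_offset_and_delay(100000, 105010, 105012, 100022)
print(offset_ms, delay_ms)
```

The estimate is exact only when the two network legs take equal time, which is why protocols such as PTP add hardware timestamping to reduce asymmetry.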
  • the audio playback device 200 sends the audio fragment to the audio playback device 300.
  • on the one hand, the obtained audio fragment is sent to its own cache module (ie, step S404d), and on the other hand, the obtained audio fragment is sent to the audio playback device 300 (eg, the cache module of the audio playback device 300).
  • the audio playback device 300 caches the audio fragment.
  • the buffering module of the audio playback device 300 buffers the received audio fragments.
  • the audio playback device 200 notifies the audio playback device 300 to start audio playback.
  • when the distribution module of the audio playback device 200 determines that the current time is later than or equal to the expected playback time point of the first audio segment, on the one hand, it notifies its own playback module to start audio playback (ie, step S405a), and on the other hand, it notifies the audio playback device 300 (eg, a playback module of the audio playback device 300) to start audio playback.
  • the audio playback device 300 starts audio playback.
  • the audio playback device 300 reads the first audio segment and subsequent audio segments from its own cache module, and starts playing.
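The fan-out performed by the distribution module in steps S802 and S804 can be sketched as below. The class `Distributor`, its callbacks, and the dictionary layout of a segment are hypothetical; only the `playtime` field name follows the expected playback time point mentioned in the source. The sketch also assumes no segment has been consumed from the local cache before playback starts.

```python
from collections import deque


class Distributor:
    """Hypothetical distribution module of the master device (device 200)."""

    def __init__(self, send_to_slave, notify_slave, start_local):
        self.local_cache = deque()
        self.send_to_slave = send_to_slave  # e.g. a network send callback
        self.notify_slave = notify_slave
        self.start_local = start_local
        self.started = False

    def on_segment(self, segment):
        self.local_cache.append(segment)    # own cache module (S404d)
        self.send_to_slave(segment)         # slave's cache module (S802)

    def poll(self, now):
        """Start playback once the first segment's expected time is reached."""
        if self.started or not self.local_cache:
            return False
        if now < self.local_cache[0]["playtime"]:
            return False
        self.start_local()                  # own playback module (S405a)
        self.notify_slave()                 # slave's playback module (S804)
        self.started = True
        return True


slave_cache, events = [], []
dist = Distributor(slave_cache.append,
                   lambda: events.append("slave start"),
                   lambda: events.append("local start"))
dist.on_segment({"seq": 0, "playtime": 100.0, "pcm": b""})
dist.on_segment({"seq": 1, "playtime": 100.02, "pcm": b""})
print(dist.poll(99.9), dist.poll(100.0), events)
```

Because both devices share a synchronized clock and the same `playtime` values, the start notification only needs to say "begin", not when.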
  • the audio fragment received by the audio playback device 300 carries the expected playback time point, and the expected playback time point is periodically updated by the audio playback device 200 according to the playback progress difference between the audio source device 100 and the audio playback device 200.
  • when the audio playback device 300 plays the audio segment, it writes the audio segment into the audio output driver (for example, ALSA), and calls the audio output driver's interface to read the speaker output time point of the currently written audio segment.
  • the speaker output time point may be considered to be the time point when the audio playback device 300 actually plays the currently written audio segment, which is simply referred to as the actual playback time point of the currently written audio segment.
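How the speaker output time point is obtained depends on the driver. One common approach, assumed here for illustration, is to take the time of the write and add the driver's reported output delay converted from frames to seconds (ALSA, for example, exposes an output delay in frames through `snd_pcm_delay`). The function names and parameters below are hypothetical.

```python
def speaker_output_time(now_s, queued_frames, sample_rate_hz):
    """Estimate when a segment written now will start to reach the speaker.

    queued_frames: frames already queued in the driver ahead of this segment.
    """
    return now_s + queued_frames / sample_rate_hz


def playback_difference(actual_s, playtime_s):
    """Difference between actual and expected playback time points."""
    return actual_s - playtime_s


actual = speaker_output_time(now_s=10.0, queued_frames=4800, sample_rate_hz=48000)
print(actual, playback_difference(actual, playtime_s=10.08))
```

With 4800 frames queued at 48 kHz the segment is heard about 0.1 s after the write, so it plays roughly 20 ms later than a `playtime` of 10.08 s.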
  • the playback speed of the audio playback device 300 remains consistent with the playback speed (or delivery speed) of the audio source device 100.
  • FIG. 9 shows the flow of another audio playback method provided by the present application. As shown in FIG. 9, the method includes steps S401 to S404, steps S405a, S406, S601 to S611, steps S801 to S804, and steps S901 to S911.
  • the difference between the process in this embodiment and the process in FIG. 8 is that the audio playback device 200 records the difference between the actual playback time point of each audio fragment and the expected playback time point carried in the audio fragment, and adjusts its playback speed according to the difference, so that the playback speed of the audio playback device 200 is consistent with the playback speed of the audio source device 100. That is, the audio playback device 200 executes steps S601 to S611; for details, refer to the related content of the flow in FIG. 6.
  • the audio playback device 300 also uses a similar method to adjust its own playback speed, so that the playback speed of the audio playback device 300 is also consistent with the playback speed of the audio source device 100. That is, the audio playback device 300 executes steps S901 to S911. In some examples, the audio playback device 300 further includes a playback speed module.
  • the playback module of the audio playback device 300 acquires the content of the audio segment from the cache module.
  • the playback module writes the acquired audio segment content to the audio output driver.
  • the playback module invokes the interface of the audio output driver to query the speaker expected output time of the currently written audio segment, that is, the actual playback time point of the currently written audio segment.
  • after the playback module writes the audio segment to the audio output driver, it also calls the interface of the audio output driver to query the speaker expected output time of the currently written audio segment, and the speaker expected output time of the audio segment can be considered to be the actual playback time point of the audio segment.
  • the playback module calculates the difference between the actual playback time point of the audio fragment and the expected playback time point (eg, playtime) carried in the audio fragment, and determines whether the difference is greater than the preset threshold A.
  • the playback module notifies the cache module to delete or add audio segments.
  • the playback module sends the actual playback time point of the audio segment and its corresponding difference to the playback speed module.
  • the playback speed module stores the actual playback time point of the audio segment and its corresponding difference.
  • the playback speed module calculates the deviation of the playback speed according to the actual playback time point of the audio segment and its corresponding difference.
  • the playback speed module calculates the target playback speed according to the deviation of the playback speed.
  • the playback speed module sends the target playback speed to the playback module.
  • the playback module modifies the playback speed value of the audio output driver to the target playback speed.
  • for details of steps S901 to S911, refer to the related content of steps S601 to S611 in FIG. 6, which is not repeated here.
  • the target playback speed of the audio playback device 300 calculated in steps S901 to S911 is the same or approximately the same as the target playback speed of the audio playback device 200 calculated in steps S601 to S611.
  • the audio playback device 200 may also use the existing technical solution to determine the expected playback time point of each audio segment, that is, without using adjustment coefficients to adjust the expected playback time point of each audio segment; instead, the playback speed of the audio playback device 200 is adjusted directly according to the difference between the actual playback time point of each audio segment and the expected playback time point until it is consistent with the playback speed (or delivery speed) of the audio source device.
  • the audio playback device 300 can also directly determine the actual playback time point of each audio segment according to its own audio output driver, calculate the difference between the actual playback time point and the expected playback time point of each audio segment, and adjust the playback speed of the audio playback device 300 until it is consistent with the playback speed (or delivery speed) of the audio source device.
  • the audio playback device 200 may be a master speaker or a master headset, and the audio playback device 300 may be a slave speaker or a slave headset.
  • the audio playback device 200 may be a master speaker, and the audio playback device 300 may be a slave earphone.
  • the audio playback device 200 may be a master earphone, and the audio playback device 300 may be a slave speaker.
  • the audio playback device is not limited to special audio playback devices such as speakers and earphones, and may also be a composite device such as a mobile device with speakers.
  • the embodiments of the present application also provide a chip system.
  • the chip system includes at least one processor 2101 and at least one interface circuit 1102.
  • the processor 2101 and the interface circuit 1102 may be interconnected by wires.
  • the interface circuit 1102 may be used to receive signals from other devices (eg, the memory of the audio playback device 200).
  • the interface circuit 1102 may be used to send signals to other devices (eg, the processor 2101).
  • the interface circuit 1102 may read the instructions stored in the memory and send the instructions to the processor 2101.
  • the electronic device can be made to execute various steps executed by the audio playback device 200 (eg, a sound box) in the above-mentioned embodiment.
  • the chip system may also include other discrete devices, which are not specifically limited in this embodiment of the present application.
  • the above-mentioned terminal and the like include corresponding hardware structures and/or software modules for executing each function.
  • the embodiments of the present application can be implemented in hardware or a combination of hardware and computer software. Whether a function is performed by hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of the embodiments of the present invention.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules. It should be noted that, the division of modules in the embodiment of the present invention is schematic, and is only a logical function division, and there may be other division manners in actual implementation.
  • Each functional unit in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium.
  • a computer-readable storage medium includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes: flash memory, removable hard disk, read-only memory, random access memory, magnetic disk or optical disk and other media that can store program codes.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

An audio playback method, device and system, relating to the technical field of audio. The audio playback device can be adjusted to be consistent with the playing progress of an audio source, thereby improving the listening experience of users. The audio playback method comprises: after receiving audio data sent by an audio source device, the audio playback device segments the audio data into a plurality of audio segments, and adjusts an expected playback time point of subsequent audio segments according to the change trend of the number of audio segments in a cache region of the audio playback device, so as to maintain a consistent playback progress (or delivery progress) with the audio source device; or, the audio playback device can adjust the playback speed thereof according to a deviation between an actual playback time point of each audio segment and the expected playback time point, to be consistent with the playback speed (or delivery speed) of the audio source device, so as to maintain a consistent playback progress (or delivery progress) with the audio source device.

Description

An audio playback method, device and system
This application claims priority to Chinese patent application No. 202110221879.2, entitled "An audio playback method, device and system" and filed with the State Intellectual Property Office on February 27, 2021, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of audio technology, and in particular, to an audio playback method, device, and system.
Background
A smart device can connect to an audio playback device (such as headphones or a speaker) through wireless communication such as Bluetooth or Wi-Fi, and play audio content through the audio playback device while the smart device itself plays the video picture. Because the crystal oscillators of the smart device and the audio playback device differ, the two devices play at different speeds, so the video picture played by the smart device may fall out of sync with the audio content played by the audio playback device. The effect becomes more obvious after a long period of playback and results in a poor user experience.
Summary of the invention
To solve the above technical problem, the present application provides an audio playback method, device and system. The technical solution provided by the present application keeps the playback progress of the audio playback device consistent with that of the audio source device (ie, the smart device), improving the user experience, especially the listening experience.
In a first aspect, an audio playback method is provided, applied to a first audio playback device that communicates wirelessly with an audio source device. The method includes: receiving audio data sent by the audio source device; dividing the audio data into N audio segments; buffering the N audio segments, wherein an expected playback time point of each audio segment is obtained according to a first adjustment coefficient; playing each audio segment in sequence; periodically collecting the current number of buffered audio segments and the collection time point corresponding to the current number; after the duration of the periodic collection reaches a preset duration, or after the number of periodic collections reaches a preset number, obtaining a second adjustment coefficient according to the current number collected each time and the collection time point corresponding to each collected current number; obtaining an expected playback time point of each subsequent audio segment according to the second adjustment coefficient; and playing the subsequent audio segments in sequence; wherein N is a positive integer greater than or equal to 2, and the first adjustment coefficient is a preset coefficient.
It can be understood that the change trend of the amount of data in the buffer of the first audio playback device reflects the deviation between the playback speeds of the audio source device and the first audio playback device. Therefore, the time point at which each audio segment in the audio data is expected to start playing (the expected playback time point for short) can be adjusted according to that trend, so as to synchronize the playback speeds of the first audio playback device and the audio source device. This prevents the buffer of the first audio playback device from overflowing or running empty, which in turn avoids stuttering or popping during playback and improves the listening experience of audio played through the external device.
In a possible implementation, before receiving the audio data sent by the audio source device, the method further includes: receiving an instruction to play audio sent by the audio source device.
In a possible implementation, obtaining the second adjustment coefficient according to the current number collected each time and the collection time point corresponding to each collected current number includes: performing linear fitting on the current number collected each time and its corresponding collection time point to obtain a first slope; and obtaining the second adjustment coefficient according to the first slope.
For example, discrete points are plotted on a two-dimensional plane with time as the X axis and the number of first audio segments buffered by the first audio playback device as the Y axis. In other words, each plotted point represents the number of first audio segments collected at the corresponding collection time point. Linear regression over the discrete points then yields a straight line whose slope (the first slope) represents the trend of increase or decrease in the buffered first audio segments, that is, the number of first audio segments added or removed per unit time. A positive first slope indicates the number of first audio segments added per unit time, and means that the playback speed (or delivery speed) of the audio source device is faster than the playback speed of the first audio playback device. A negative first slope indicates the number of first audio segments removed per unit time, and means that the playback speed (or delivery speed) of the audio source device is slower than the playback speed of the first audio playback device. Calculating the adjustment coefficient from the first slope allows the playback of the subsequent second audio segments to be adjusted so that it quickly becomes consistent with the playback speed or delivery speed of the audio source device.
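For reference, if the linear regression is taken to be an ordinary least-squares fit (an assumption; the source does not name the fitting method), the first slope over m samples has the standard closed form below. The symbols k_1, m, t_i (collection time point) and n_i (buffered segment count at t_i) are introduced here for illustration only.

```latex
k_1 = \frac{\sum_{i=1}^{m} (t_i - \bar{t})(n_i - \bar{n})}{\sum_{i=1}^{m} (t_i - \bar{t})^2},
\qquad
\bar{t} = \frac{1}{m}\sum_{i=1}^{m} t_i,
\quad
\bar{n} = \frac{1}{m}\sum_{i=1}^{m} n_i
```

For instance, if the buffered count rises along a straight line from 10 to 13 segments over 60 seconds, then k_1 = 3/60 = 0.05 segments per second. The value is positive, so the audio source device is delivering faster than the first audio playback device plays.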
In a possible implementation, periodically collecting the current number of buffered audio segments and the collection time point corresponding to the current number includes: when the absolute value of the difference between the actual playback time point and the expected playback time point of any audio segment is greater than a first threshold, the first audio playback device starts to periodically collect the current number of buffered audio segments and the corresponding collection time point; wherein the actual playback time point of an audio segment is the expected speaker output time point of the audio segment, which the first audio playback device obtains by calling an interface of its audio output driver.
This provides the timing for starting to collect the number of buffered first audio segments.
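The trigger and the stop condition for collection can be sketched as a small state holder. The class `BufferSampler` and its method names are hypothetical, and only the count-based stop condition is shown; the duration-based condition mentioned in the first aspect would be handled the same way with a timestamp comparison.

```python
class BufferSampler:
    """Collects (time, buffered segment count) pairs once drift is detected."""

    def __init__(self, first_threshold, max_samples):
        self.first_threshold = first_threshold
        self.max_samples = max_samples
        self.active = False
        self.samples = []

    def on_segment_played(self, actual, expected):
        # Sampling starts once any segment misses its expected playback
        # time point by more than the first threshold.
        if abs(actual - expected) > self.first_threshold:
            self.active = True

    def on_tick(self, now, buffered_count):
        """Call once per sampling period; True when enough samples exist."""
        if self.active and len(self.samples) < self.max_samples:
            self.samples.append((now, buffered_count))
        return len(self.samples) >= self.max_samples


sampler = BufferSampler(first_threshold=20, max_samples=3)
sampler.on_tick(0, 10)                 # ignored: sampling not triggered yet
sampler.on_segment_played(1050, 1000)  # 50 ms off: triggers sampling
for now, count in [(1, 10), (2, 11), (3, 12)]:
    done = sampler.on_tick(now, count)
print(done, sampler.samples)
```

Once `on_tick` returns True, the collected pairs are what a linear fit would consume to produce the first slope.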
In a second aspect, an audio playback method is provided, applied to a first audio playback device that communicates wirelessly with an audio source device. The method includes: receiving audio data sent by the audio source device; dividing the audio data into N audio segments; buffering the N audio segments, wherein an expected playback time point of each audio segment is obtained according to a first adjustment coefficient; playing each audio segment in sequence; and, after the absolute value of the difference between the actual playback time point and the expected playback time point of an audio segment is greater than a preset threshold, adjusting the number of buffered audio segments; wherein the actual playback time point of an audio segment is the expected speaker output time point of the audio segment, which the first audio playback device obtains by calling an interface of its audio output driver; N is a positive integer greater than or equal to 2; and the first adjustment coefficient is a preset coefficient.
This provides a method for adjusting the playback speed of the first audio playback device so that it matches the playback speed or delivery speed of the audio source, which helps keep the playback progress consistent with the playback progress or delivery progress of the audio source over a long time.
In a possible implementation, adjusting the number of buffered audio segments after the absolute value of the difference between the actual playback time point and the expected playback time point of an audio segment is greater than the preset threshold includes: when the absolute value of the difference is greater than the preset threshold and the difference is negative, adding a first number of audio segments; and when the absolute value of the difference is greater than the preset threshold and the difference is positive, deleting the first number of audio segments.
In a possible implementation, the first number is related to the quotient of the absolute value of the difference divided by the playback duration of an audio segment.
In a possible implementation, the first number is the quotient of the absolute value of the difference divided by the playback duration of an audio segment.
In a possible implementation, the added first number of audio segments are mute data.
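The count of segments to add or delete, and the mute data used for padding, can be sketched as follows. The sketch assumes the quotient is an integer (floor) quotient, and that audio is signed PCM, where silence is all-zero bytes; the function names, the 20 ms segment length, and the 48 kHz stereo 16-bit format are hypothetical.

```python
def segments_to_adjust(diff_ms, segment_ms):
    """First number: |difference| divided by the segment playback duration."""
    return abs(diff_ms) // segment_ms  # integer quotient (assumed)


def silent_segment(segment_ms, sample_rate_hz=48000, channels=2,
                   bytes_per_sample=2):
    """One segment of silence as signed PCM (all-zero bytes)."""
    frames = sample_rate_hz * segment_ms // 1000
    return bytes(frames * channels * bytes_per_sample)


count = segments_to_adjust(diff_ms=-65, segment_ms=20)  # early by 65 ms
padding = [silent_segment(20) for _ in range(count)]
print(count, len(padding[0]))
```

Rounding down means a residual error smaller than one segment remains after the adjustment, which the playback-speed path can then absorb.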
在一种可能的实现方式中,在音频分片的实际播放时间点与预计播放时间点的差 值的绝对值小于或等于预设阈值后,调整第一音频播放设备的播放速度。In a possible implementation, after the absolute value of the difference between the actual playback time point of the audio fragment and the expected playback time point is less than or equal to a preset threshold, the playback speed of the first audio playback device is adjusted.
在一种可能的实现方式中,在音频分片的实际播放时间点与预计播放时间点的差值的绝对值小于或等于预设阈值后,采集音频分片的实际播放时间点,以及音频分片的实际播放时间点与预计播放时间点的差值;在采集的次数达到预设次数后,或者,在采集的时长达到预设时长后,对每次采集的实际播放时间点,每次采集的实际播放时间点对应的差值进行线性拟合,得到第二斜率;获取第一音频播放设备的当前播放速度;根据当前播放速度和第二斜率,得到调整后的播放速度;以调整后的播放速度,依次播放后续的音频分片。In a possible implementation manner, after the absolute value of the difference between the actual playback time point of the audio segment and the expected playback time point is less than or equal to a preset threshold, collect the actual playback time point of the audio segment, and the audio segment The difference between the actual playback time point and the expected playback time point of the film; after the number of collections reaches the preset number of times, or after the collection time reaches the preset time, for the actual playback time point of each collection, each collection Perform linear fitting on the difference corresponding to the actual playback time point of , to obtain the second slope; obtain the current playback speed of the first audio playback device; obtain the adjusted playback speed according to the current playback speed and the second slope; Playback speed, play subsequent audio segments in sequence.
由此,提供了一种计算第一音频播放设备与音频源设备的速度偏差的具体方法。Thus, a specific method for calculating the speed deviation between the first audio playback device and the audio source device is provided.
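The fitting step above can be sketched with an ordinary least-squares line through the collected (actual playback time point, difference) pairs. The text states only that the adjusted speed is obtained "according to the current playback speed and the second slope"; the formula `current_speed * (1 + slope)`, the sign convention (difference = actual minus expected), and all names below are assumptions made for illustration.

```python
def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all time points are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def adjusted_speed(current_speed, second_slope):
    """Assumed rule: a growing lag (positive slope) calls for faster playback."""
    return current_speed * (1.0 + second_slope)


if __name__ == "__main__":
    # Hypothetical samples: the device falls 1 ms further behind every second.
    actual_times = [0.0, 1.0, 2.0, 3.0]       # seconds
    differences = [0.0, 0.001, 0.002, 0.003]  # actual - expected, seconds
    slope = fit_slope(actual_times, differences)
    print(slope, adjusted_speed(1.0, slope))
```

Because both axes are in seconds, the slope is dimensionless (seconds of drift per second of playback), which is why it can be applied directly as a relative speed correction in this sketch.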
In a possible implementation, the first audio playback device is connected to a second audio playback device, and the method further includes: sending the N audio segments to the second audio playback device.
That is, the first audio playback device can play audio together with the second audio playback device. When playback is synchronized between the first audio playback device and the audio source device, playback is also synchronized between the second audio playback device and the audio source device. Further, the second audio playback device may adjust its own playback speed by using the same playback speed adjustment method as the first audio playback device.
In a possible implementation, before the first audio playback device plays the first audio segment, the method further includes: sending, to the second audio playback device, an instruction to start playing audio segments.
In a third aspect, a first audio playback device is provided. The first audio playback device includes a processor, an audio output apparatus, and a memory. The audio output apparatus and the memory are both coupled to the processor. The memory is configured to store a computer program, and when the computer program is executed by the processor, the first audio playback device is caused to perform the method in the first aspect or any possible implementation of the first aspect, or the method in the second aspect or any possible implementation of the second aspect.
In a fourth aspect, an apparatus is provided. The apparatus is included in a first audio playback device and has the function of implementing the behavior of the first audio playback device in any method of the foregoing aspects and their possible implementations. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes at least one module or unit corresponding to the function, for example, a communication module or unit, a processing module or unit, and a playback module or unit.
In a fifth aspect, a computer-readable storage medium is provided. The computer-readable storage medium includes a computer program. When the computer program runs on a first audio playback device, the first audio playback device is caused to perform the method in the first aspect or any possible implementation of the first aspect, or the method in the second aspect or any possible implementation of the second aspect.
In a sixth aspect, a computer program product is provided. When the computer program product runs on a computer, the computer is caused to perform the method in the first aspect or any possible implementation of the first aspect, or the method in the second aspect or any possible implementation of the second aspect.
In a seventh aspect, a chip system is provided. The chip system includes a processor. When the processor executes instructions, the processor performs the method in the first aspect or any possible implementation of the first aspect, or the method in the second aspect or any possible implementation of the second aspect.
In an eighth aspect, a system is provided. The system includes an audio source playback device and a first audio playback device. The first audio playback device performs the method in the first aspect or any possible implementation of the first aspect, or the method in the second aspect or any possible implementation of the second aspect.
In a possible implementation, the system further includes a second audio playback device, and the second audio playback device performs the method in the second aspect or any possible implementation of the second aspect.
It can be understood that, for the beneficial effects achievable by the first audio playback device of the third aspect, the apparatus of the fourth aspect, the computer storage medium of the fifth aspect, the computer program product of the sixth aspect, the chip system of the seventh aspect, and the system of the eighth aspect, reference may be made to the beneficial effects of the first aspect or the second aspect and any possible design thereof. Details are not repeated here.
Description of Drawings
FIG. 1 is a schematic diagram of a scenario of an audio playback method provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an audio source device provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an audio playback device provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of an audio playback method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a method for fitting the trend in the number of audio segments buffered by an audio playback device, provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of an audio playback method provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a method for fitting the trend in the difference between the actual playback time point and the expected playback time point of an audio playback device, provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of an audio playback method provided by an embodiment of the present application;
FIG. 9 is a schematic flowchart of an audio playback method provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a chip system provided by an embodiment of the present application.
Detailed Description
In the description of the embodiments of the present application, unless otherwise specified, "/" means "or". For example, A/B may mean A or B. "And/or" herein merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean three cases: A alone, both A and B, and B alone.
Hereinafter, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more such features. In the description of the embodiments of the present application, unless otherwise specified, "a plurality of" means two or more.
In the embodiments of the present application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present application should not be construed as preferred or advantageous over other embodiments or designs. Rather, such words are intended to present the related concepts in a concrete manner.
FIG. 1 is a schematic diagram of a scenario of an audio playback method provided by an embodiment of the present application. FIG. 1 shows a communication system provided by an embodiment of the present application. The communication system includes an audio source device 100 and an audio playback device 200. Optionally, the communication system may further include an audio playback device 300.
The audio source device 100 is configured to provide audio content to the audio playback device 200. For example, the audio source device 100 in the embodiments of the present application may be a mobile phone, a tablet computer, a personal computer (PC), a personal digital assistant (PDA), a netbook, a wearable device, an augmented reality (AR) device, a virtual reality (VR) device, an in-vehicle device, a smart screen, or the like. The present application places no particular limitation on the specific form of the audio source device 100.
The audio playback device 200 is configured to receive the audio content sent by the audio source device 100 and to play that audio content. For example, the audio playback device 200 may be a wireless headset, a wireless speaker, a wearable device, an AR device, a VR device, or the like. The present application places no particular limitation on the specific form of the audio playback device 200.
In one application scenario, when playing a video, the audio source device 100 may display the picture content of the video on its own display screen and send the audio content of the video to the audio playback device 200, which plays the audio content.
Generally, to reduce the impact of network jitter during wireless transmission, the audio playback device 200 buffers the audio content received from the audio source device 100 and delays playback. In addition, because the audio source device 100 and the audio playback device 200 are different devices with hardware differences (for example, different crystal oscillator frequencies), the two devices play at different speeds. This in turn causes the buffer of the audio playback device 200 to overflow or run empty, so that the sound stutters or pops during playback.
In one solution, a maximum threshold and a minimum threshold may be set for the data buffered by the audio playback device 200. When the data in the buffer of the audio playback device 200 exceeds the maximum threshold, the playback speed of the audio playback device 200 is increased by a certain ratio. When the data in the buffer of the audio playback device 200 falls below the minimum threshold, the playback speed of the audio playback device 200 is decreased by a certain ratio. In this way, the amount of data buffered by the audio playback device 200 is kept within a preset range, which reduces buffer overflow and underrun at the audio playback device 200.
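The threshold-based solution just described can be sketched as below. The function name, the 1% step, and the threshold values are hypothetical and chosen only to make the behavior concrete.

```python
def threshold_adjust(buffered, speed, min_threshold, max_threshold, ratio=0.01):
    """Fixed-ratio speed adjustment driven only by buffer occupancy.

    All parameter names and the default ratio are illustrative assumptions.
    """
    if buffered > max_threshold:
        return speed * (1.0 + ratio)  # buffer too full: play faster to drain it
    if buffered < min_threshold:
        return speed * (1.0 - ratio)  # buffer too empty: play slower to refill it
    return speed                      # within range: leave the speed unchanged


if __name__ == "__main__":
    for level in (10, 50, 120):
        print(level, threshold_adjust(level, 1.0, min_threshold=20, max_threshold=100))
```

The step size is fixed and independent of the real speed mismatch between the two devices, and the function reacts to every threshold crossing, including those caused by network jitter rather than clock drift.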
In this solution, when the wireless transmission speed is unstable, the amount of data in the buffer of the audio playback device 200 changes constantly, and the playback speed of the audio playback device 200 may need to be adjusted frequently. The audio content is then played sometimes fast and sometimes slow, which degrades the user experience. Moreover, the playback speed of the audio playback device 200 is usually adjusted by a fixed ratio that does not match the actual playback speed, so the adjustment is not precise.
For this reason, the technical solution provided by the present application keeps the playback progress of the audio playback device consistent with that of the audio source device (that is, the smart device), improving the user experience, especially the listening experience.
By way of example, FIG. 2 shows the hardware structure of the audio source device 100.
The audio source device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It can be understood that the structure illustrated in the embodiments of the present invention does not constitute a specific limitation on the audio source device 100. In other embodiments of the present application, the audio source device 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the audio source device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function, for example, saving files such as music and videos on the external memory card.
The internal memory 121 may be used to store computer-executable program code, which includes instructions. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required for at least one function (such as a sound playback function or an image playback function). The data storage area may store data created during use of the audio source device 100 (such as audio data and a phone book). In addition, the internal memory 121 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS). The processor 110 performs the various functional applications and data processing of the audio source device 100 by running the instructions stored in the internal memory 121 and/or the instructions stored in a memory provided in the processor.
In some embodiments, the processor 110 may include one or more interfaces, for example the USB interface 130. The USB interface 130 conforms to the USB standard specification and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, or the like. The USB interface 130 may be used to connect a charger to charge the audio source device 100, or to transfer data between the audio source device 100 and a peripheral device. It may also be used to connect a headset and play audio through the headset. The interface may further be used to connect other electronic devices, such as an AR device.
The wireless communication function of the audio source device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.
The mobile communication module 150 may provide wireless communication solutions applied on the audio source device 100, including 2G/3G/4G/5G.
The wireless communication module 160 may provide wireless communication solutions applied on the audio source device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR). The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signal, and sends the processed signal to the processor 110. The wireless communication module 160 may also receive a signal to be sent from the processor 110, perform frequency modulation and amplification on it, and radiate it as electromagnetic waves through the antenna 2.
In the embodiments of the present application, the audio source device 100 may establish a communication connection with the audio playback device 200 through the wireless communication module 160 and send the audio content to be played to the audio playback device 200 over the wireless connection, for playback by the audio playback device 200. The audio content to be played may be the sound content of a video or standalone audio, such as music. In some examples, the audio playback device 200 may further forward the audio content to the audio playback device 300, so that the audio playback device 200 and the audio playback device 300 play it together. Here, the audio playback device 200 is the master playback device, the audio playback device 300 is a slave playback device, and there may be one or more audio playback devices 300.
The audio source device 100 implements the display function through the GPU, the display screen 194, the application processor, and the like.
The audio source device 100 may implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset jack 170D, the application processor, and the like.
By way of example, FIG. 3 shows the hardware structure of the audio playback device 200.
The audio playback device 200 may include a processor 210, a memory 220, a wireless communication module 230, an antenna 240, a speaker 250, a power module 260, and the like. It can be understood that the structure illustrated in the embodiments of the present invention does not constitute a specific limitation on the audio playback device 200. In other embodiments of the present application, the audio playback device 200 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 210 may include one or more processing units. For example, the processor 210 may include a distribution module, a playback module, a buffer module, an adjustment coefficient calculation module, and a playback speed module. Optionally, the processor 210 may further include a clock synchronization module and the like. The specific role of each module is described in detail below with reference to specific embodiments. Different processing units may be independent devices or may be integrated in one or more processors.
The memory 220 may be used to store computer-executable program code, which includes instructions. In some examples, the memory 220 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS). The processor 210 performs the various functional applications and data processing of the audio playback device 200 by running the instructions stored in the memory 220 and/or the instructions stored in a memory provided in the processor.
The wireless communication function of the audio playback device 200 may be implemented by the antenna 240, the wireless communication module 230, the modem processor and the baseband processor in the processor 210, and the like.
The wireless communication module 230 may provide wireless communication solutions applied on the audio playback device 200, including WLAN (such as a Wi-Fi network), BT, GNSS, FM, NFC, and IR. The wireless communication module 230 may be one or more devices integrating at least one communication processing module. The wireless communication module 230 receives electromagnetic waves via the antenna 240, performs frequency modulation and filtering on the electromagnetic wave signal, and sends the processed signal to the processor 210. The wireless communication module 230 may also receive a signal to be sent from the processor 210, perform frequency modulation and amplification on it, and radiate it as electromagnetic waves through the antenna 240.
In the embodiments of the present application, the audio playback device 200 may establish a communication connection with the audio source device 100 through the wireless communication module 230, receive the audio content sent by the audio source device 100 over the wireless connection, and play it through the speaker 250. In some examples, the audio playback device 200 may also establish a communication connection with another audio playback device 300 through the wireless communication module 230. The audio playback device 200 can then forward the audio content to the audio playback device 300, so that the audio playback device 200 and the audio playback device 300 play it together and produce sound with a stereo effect. Here, the audio playback device 200 is the master playback device, the audio playback device 300 is a slave playback device, and there may be one or more audio playback devices 300.
The power module 260 supplies power to the components of the audio playback device 200, for example, the processor 210, the memory 220, and the wireless communication module 230.
It should be noted that, for the structure of the audio playback device 300, reference may be made to the audio playback device 200. The structure of the audio playback device 300 may be the same as or different from that of the audio playback device 200, which is not limited in the present application.
The technical solutions provided in the embodiments of the present application are applicable to the communication system shown in FIG. 1, in which the audio source device 100 has the structure shown in FIG. 2 and the audio playback device 200 has the structure shown in FIG. 3.
In one technical solution provided by the present application, the audio playback device 200 samples the amount of data in its buffer at multiple adjacent time points (or moments) and calculates the trend of the buffered data amount by linear fitting. The trend of the buffered data amount reflects the deviation between the playback speeds of the audio source device 100 and the audio playback device 200. Therefore, according to this trend, the time point at which each audio segment in the audio content is expected to start playing (the expected playback time point for short) can be adjusted, so that playback on the audio playback device 200 is synchronized with playback on the audio source device 100 and the user's listening experience is improved.
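A minimal sketch of the buffer-trend fit described above is given here. It fits a least-squares line to (time point, buffered amount) samples and reads the sign of the slope. The function names, the sample values, the units, and the tolerance used to treat a slope as flat are illustrative assumptions; how the expected playback time points are then shifted is not shown.

```python
def buffer_trend(samples):
    """Least-squares slope of buffered amount over time.

    `samples` is a list of (time_s, buffered_segments) pairs (assumed units).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_b = sum(b for _, b in samples) / n
    stt = sum((t - mean_t) ** 2 for t, _ in samples)
    if stt == 0:
        raise ValueError("samples must span more than one time point")
    stb = sum((t - mean_t) * (b - mean_b) for t, b in samples)
    return stb / stt


def describe_trend(slope, tolerance=1e-6):
    """Interpret the slope; the tolerance is an arbitrary illustrative value."""
    if slope > tolerance:
        return "buffer growing: playback device consumes slower than the source delivers"
    if slope < -tolerance:
        return "buffer shrinking: playback device consumes faster than the source delivers"
    return "buffer stable"


if __name__ == "__main__":
    # Hypothetical samples taken one second apart.
    samples = [(0.0, 100), (1.0, 102), (2.0, 104), (3.0, 106)]
    slope = buffer_trend(samples)
    print(slope, describe_trend(slope))
```

Fitting over several samples, rather than reacting to a single reading, smooths out short-lived fluctuations caused by network jitter.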
在本申请提供的又一种技术方案中,由于音频播放设备200和音频源设备100的播放速度不同,还可以调整音频播放设备200的播放速度,使得音频播放设备200的播放速度与音频源设备100的播放速度保持一致。具体的,音频播放设备200可以采集音频分片中的预计播放时间点,与对应的音频播放设备200实际开始播放该音频分片的时间点(简称为音频分片的实际播放时间点)的偏差,然后通过线性拟合的方式计算出偏差的变化趋势;并利用该变化趋势去调整音频播放设备200的播放速度,使得音频播放设备200的播放速度与音频源设备100的播放速度一致。In yet another technical solution provided by the present application, since the playback speeds of the audio playback device 200 and the audio source device 100 are different, the playback speed of the audio playback device 200 can also be adjusted so that the playback speed of the audio playback device 200 is the same as that of the audio source device. The playback speed of 100 remains the same. Specifically, the audio playback device 200 may collect the expected playback time point in the audio segment, and the deviation from the corresponding time point when the audio playback device 200 actually starts playing the audio segment (referred to as the actual playback time point of the audio segment). , and then calculate the variation trend of the deviation by linear fitting; and use the variation trend to adjust the playback speed of the audio playback device 200 so that the playback speed of the audio playback device 200 is consistent with the playback speed of the audio source device 100 .
In another application scenario, when playing a video, the audio source device 100 may show the video picture on its own display and send the video's audio content to the audio playback device 200 and the audio playback device 300, which play the audio content jointly. Here the audio playback device 200 is the master playback device and the audio playback device 300 is the slave playback device.
For synchronized playback between the audio source device 100 and the audio playback device 200, refer to the description of the previous application scenario; it is not repeated here. The audio playback device 200 and the audio playback device 300 are usually devices from the same manufacturer and can perform clock synchronization. Even after clock synchronization, they remain separate devices whose playback speeds differ because of hardware differences (for example, different crystal oscillator frequencies), so their playback progress can still drift out of sync as playback time grows.
Similarly, the audio playback device 300 may also collect the deviation between the expected playback time point carried in an audio segment and the time point at which the audio playback device 300 actually starts playing the segment (the actual playback time point for short), compute the trend of this deviation by linear fitting, and use the trend to adjust its own playback speed so that it stays consistent with the playback speed of the audio source device 100.
In yet another application scenario, when playing audio (for example, music), the audio source device 100 may send the audio directly to the audio playback device 200, which plays the audio content.
In general, the speed at which the audio source device 100 sends the audio stream to the audio playback device 200 (also called the delivery speed) differs from the playback speed of the audio playback device 200. This mismatch still causes the buffer of the audio playback device 200 to overflow or run dry, producing stutter or popping during playback. The audio playback device 200 therefore needs to adjust its playback speed to match the delivery speed of the audio source device 100, which avoids buffer overflow and exhaustion and the resulting stutter or popping.
In yet another application scenario, when playing audio (for example, music), the audio source device 100 may send the audio directly to the audio playback device 200 and the audio playback device 300, which play the audio content jointly.
In that case, the audio playback device 200 needs to adjust its playback speed to match the delivery speed of the audio source device 100, and the audio playback device 300 likewise needs to adjust its playback speed to match the delivery speed of the audio source device 100.
The technical solutions of this application are described below with reference to the accompanying drawings.
FIG. 4 shows the flow of an audio playback method provided by this application. As shown in FIG. 4, the audio playback method may include:
S401. The audio source device 100 receives an instruction to play audio.
For example, the audio source device 100 is connected to at least one audio playback device, such as the audio playback device 200. The audio source device 100 may establish a wireless connection with the audio playback device 200 through the wireless communication module 160, using, for example, Bluetooth, WLAN, or NFC.
When the user operates the audio source device 100 to start playing a video, the audio source device 100 may show the video picture and send the video's audio content (that is, the audio stream) to the audio playback device 200 for playback, so that the audio is played out externally. Alternatively, when the user operates the audio source device 100 to start playing pure audio (for example, music or a recording), the audio source device 100 sends the audio stream to the audio playback device 200.
Note that when the audio source device 100 plays a video, the playback speed of the picture on the audio source device 100 side should match the playback speed of the audio content on the audio playback device 200 side. When the audio source device 100 plays pure audio, the speed at which it delivers the audio stream to the audio playback device 200 should match the playback speed of the audio content on the audio playback device 200 side.
S402. The audio source device 100 sends the audio stream to the audio playback device 200.
S403. The audio playback device 200 encodes/decodes the received audio stream.
For example, suppose the audio playback device 200 includes a distribution module, an adjustment coefficient calculation module, a playback module, and a cache module. The distribution module receives the audio stream sent by the audio source device 100 and encodes/decodes it to obtain audio data in the device's own playback format. The encoding/decoding depends on the number of channels, the sample bit depth, and the sampling frequency of the audio playback device 200. For the specific process, refer to the relevant audio codec technology; it is not described here.
S404. The audio playback device 200 splits the encoded/decoded audio stream into segments and calculates the expected playback time point of each audio segment according to the adjustment coefficient.
For example, with the audio playback device 200 again including a distribution module, an adjustment coefficient calculation module, a playback module, and a cache module, step S404 may specifically include steps S404a to S404e.
S404a. After obtaining the encoded/decoded audio stream, the distribution module requests the current adjustment coefficient from the adjustment coefficient calculation module.
S404b. The adjustment coefficient calculation module returns the current adjustment coefficient to the distribution module.
Note that the adjustment coefficient calculation module may update the value of the adjustment coefficient periodically, and the distribution module calculates the expected playback time point of each audio segment from the latest value. For how the adjustment coefficient is updated, see step S406 below. The initial value of the adjustment coefficient may be set to 1.
S404c. The distribution module splits the encoded/decoded audio stream into segments and calculates the expected playback time point of each audio segment.
For example, the data structure of each audio segment after splitting may be:
data type
-index: int
-len: int
-playtime: long long
Here, index is the number of the audio segment; numbering starts at 1 and increases by one per segment.
len is the data length of the audio segment. Its relationship to the segment playback duration is: len = number of channels * sample bit depth * sampling rate * segment playback duration / 8. The segment playback duration is a preset value, for example 10 ms (milliseconds). In other words, the data length of an audio segment is fixed.
For example, if the audio playback device 200 has 1 channel, a sample bit depth of 32 bits, a sampling rate of 96 kHz, and a segment playback duration of 10 ms, then len = 1 * 32 * 96 * 10 / 8 = 3840 bytes. With the sampling rate in kHz and the duration in ms, the product is a number of bits, and dividing by 8 gives bytes.
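The length formula can be checked with a minimal Python sketch. The function name and the choice of Hz for the sampling rate (hence the extra division by 1000) are assumptions made for illustration; the source states the formula with the rate in kHz.

```python
def segment_length_bytes(channels: int, bits_per_sample: int,
                         sample_rate_hz: int, duration_ms: int) -> int:
    """Data length of one audio segment in bytes.

    Assumption: the sampling rate is given in Hz, so the product is divided
    by 1000 to account for the duration being in milliseconds.
    """
    total_bits = channels * bits_per_sample * sample_rate_hz * duration_ms // 1000
    return total_bits // 8


# Values from the example in the text: 1 channel, 32 bits, 96 kHz, 10 ms.
print(segment_length_bytes(1, 32, 96_000, 10))  # 3840
```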
playtime is the expected playback time point of the audio segment, for example in μs (microseconds). The expected playback time point of the first audio segment (audio segment 1#) = current time + preset delay (for example, 1 s). The preset delay lets the audio playback device 200 buffer audio segment data, which prevents playback anomalies caused by network transmission jitter. The expected playback time point of the N-th audio segment = expected playback time point of the first audio segment + (N-1) * segment playback duration * 1000 * adjustment coefficient.
Thus, in this application the expected playback time point of an audio segment is determined from the expected playback time point of the first audio segment, the segment number, and the adjustment coefficient. The adjustment coefficient changes dynamically with the number of audio segments buffered in the audio playback device 200. Step S406 below describes in detail how it is calculated. The initial value of the adjustment coefficient is 1.
Continuing the example above, suppose the current time is 2020/11/11 00:00:00.000 000 and the preset delay is 1 s. The expected playback time point of the first audio segment is then 2020/11/11 00:00:01.000 000. The expected playback time point of the second audio segment = 2020/11/11 00:00:01.000 000 + (2-1) * 10 * 1000 * 1 μs = 2020/11/11 00:00:01.010 000.
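The expected playback time calculation can be sketched as follows. Time points are represented as integer microseconds counted from the "current time" of the example; that representation, the function name, and the rounding to whole microseconds are assumptions made for illustration.

```python
def expected_playtime_us(first_playtime_us: int, index: int,
                         duration_ms: int, coefficient: float = 1.0) -> int:
    """Expected playback time point of segment `index` (1-based), in microseconds."""
    if index < 1:
        raise ValueError("segment numbering starts at 1")
    # (N - 1) * duration in ms * 1000 converts to microseconds, scaled by the coefficient.
    offset_us = (index - 1) * duration_ms * 1000 * coefficient
    return first_playtime_us + round(offset_us)


# Current time taken as 0 us, preset delay 1 s, 10 ms segments, coefficient 1.
first = 0 + 1_000_000
print(expected_playtime_us(first, 1, 10))  # 1000000
print(expected_playtime_us(first, 2, 10))  # 1010000
```

The second value corresponds to 00:00:01.010 000 in the example, 10 ms after the first segment.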
The data of the first audio segment is shown in Table 1:
Table 1
index       1
len         3840
playTime    2020/11/11 00:00:01.000 000
The data of the second audio segment is shown in Table 2:
Table 2
index       2
len         3840
playTime    2020/11/11 00:00:01.010 000
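A minimal sketch of how a distribution module might produce records like those in Tables 1 and 2. The `AudioSegment` class mirrors the index/len/playtime fields; the `data` field, the function name, and the decision to leave a trailing partial segment unprocessed are assumptions, not details given in the source.

```python
from dataclasses import dataclass


@dataclass
class AudioSegment:
    index: int      # segment number, starting at 1
    len: int        # data length in bytes (fixed per segment)
    playtime: int   # expected playback time point in microseconds
    data: bytes     # payload; hypothetical field added for this sketch


def slice_stream(pcm: bytes, seg_len: int, first_playtime_us: int,
                 duration_ms: int, coefficient: float = 1.0) -> list[AudioSegment]:
    """Split decoded audio into fixed-length segments and stamp each one."""
    segments = []
    # Only whole segments are emitted; leftover bytes are assumed to wait for more data.
    for offset in range(0, len(pcm) - seg_len + 1, seg_len):
        index = offset // seg_len + 1
        playtime = first_playtime_us + round((index - 1) * duration_ms * 1000 * coefficient)
        segments.append(AudioSegment(index, seg_len, playtime, pcm[offset:offset + seg_len]))
    return segments


segments = slice_stream(bytes(2 * 3840), 3840, 1_000_000, 10)
print([(s.index, s.len, s.playtime) for s in segments])
# [(1, 3840, 1000000), (2, 3840, 1010000)]
```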
Note that throughout this document the segment playback duration is expressed in ms (milliseconds) and the expected playback time point in μs (microseconds); this is not repeated below.
Note also that, in some embodiments, the distribution module may periodically request the current adjustment coefficient from the adjustment coefficient calculation module and use it to calculate the expected playback time points of audio segments. Alternatively, the distribution module may request the current adjustment coefficient after receiving a specific amount of audio stream data. In other embodiments, after the adjustment coefficient calculation module updates the adjustment coefficient, it may send the updated value to the distribution module, which then uses it to calculate the expected playback time points. In other words, the distribution module may passively receive the latest adjustment coefficient from the adjustment coefficient calculation module and use it to calculate the expected playback time points of audio segments.
S404d. The distribution module sends each audio segment to the cache module.
The distribution module sends the generated audio segments to the cache module in order for buffering. Each audio segment carries its expected playback time point.
S404e. The cache module buffers each audio segment.
S405. When the current time reaches the expected playback time point of the first audio segment (audio segment 1#), the audio playback device 200 starts audio playback.
For example, with the audio playback device 200 again including a distribution module, an adjustment coefficient calculation module, a playback module, and a cache module, step S405 may specifically include steps S405a and S405b.
S405a. When the current time is later than or equal to the expected playback time point of the first audio segment, the distribution module notifies the playback module to start audio playback.
S405b. The playback module reads the data of the first audio segment and of the subsequent audio segments from the cache module and starts playing the segments in order.
In some embodiments, after reading the data of an audio segment from the cache module, the playback module writes it to the audio output driver in the playback module, for example the Advanced Linux Sound Architecture (ALSA). The playback module then obtains, through the interface of the audio output driver, the expected output time of the segment just written; this expected output time can be regarded as the actual playback time point of the segment.
Case 1: If the actual playback time point of an audio segment is later than the expected playback time point carried in the segment, the playback progress of the audio playback device 200 lags behind that of the audio source device 100. The playback module may notify the cache module to delete some audio segments so that the audio playback device 200 catches up with the audio source device 100 as soon as possible. For example, the number of audio segments to delete may be determined from the quotient obtained by dividing the absolute value of the difference (actual playback time point minus expected playback time point) by the segment playback duration (the duration covered by the data of one segment). If the quotient is an integer, the number of segments to delete equals the quotient. If it is not, the quotient is rounded and the resulting integer is used as the number of segments to delete. The rounding may be to the nearest integer, up, or down.
For example, if the actual playback time point minus the expected playback time point is 0.6 s (600 ms) and the playback duration of each audio segment is 10 ms, the cache module needs to delete 600 ms / 10 ms = 60 audio segments.
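The count calculation, including the three rounding options, can be sketched as below. The function name and the `rounding` parameter are hypothetical; integer microseconds are used so that the division is exact.

```python
def segments_to_adjust(actual_us: int, expected_us: int,
                       duration_ms: int, rounding: str = "nearest") -> int:
    """Number of segments covering |actual - expected|, rounded as requested."""
    if rounding not in ("nearest", "up", "down"):
        raise ValueError(f"unknown rounding mode: {rounding}")
    segment_us = duration_ms * 1000
    quotient, remainder = divmod(abs(actual_us - expected_us), segment_us)
    if remainder == 0 or rounding == "down":
        return quotient
    if rounding == "up":
        return quotient + 1
    # "nearest": round half up
    return quotient + (1 if 2 * remainder >= segment_us else 0)


# Example from the text: 600 ms late with 10 ms segments.
print(segments_to_adjust(1_600_000, 1_000_000, 10))  # 60
```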
Case 2: If the actual playback time point of an audio segment is earlier than the expected playback time point carried in the segment, the playback progress of the audio playback device 200 is ahead of that of the audio source device 100. The playback module may notify the cache module to add some audio segments. The added segments may be silent audio data, copies of the segment currently being written, or other audio data. This is equivalent to the audio playback device 200 waiting for the corresponding time before playing the subsequent audio segments, which brings its playback progress in line with that of the audio source device 100.
The number of audio segments to add is calculated in the same way as the number of segments to delete in Case 1; the details are not repeated here.
For example, if the actual playback time point minus the expected playback time point is -0.6 s (-600 ms) and the playback duration of each audio segment is 10 ms, the cache module needs to add 600 ms / 10 ms = 60 audio segments. The data of the added audio segments is all zeros, that is, silence.
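Cases 1 and 2 can be combined into one correction routine. The sketch below models the cache as a `collections.deque` of raw segment payloads, rounds the count down, and inserts the silent segments at the head of the queue; all three choices, and the function name, are assumptions for illustration.

```python
from collections import deque


def resync(cache: deque, diff_us: int, duration_ms: int, seg_len: int) -> None:
    """Correct drift, where diff_us = actual minus expected playback time point."""
    count = abs(diff_us) // (duration_ms * 1000)  # rounded down, one permitted option
    if diff_us > 0:
        # Case 1: playback is late, so drop segments from the head to catch up.
        for _ in range(min(count, len(cache))):
            cache.popleft()
    elif diff_us < 0:
        # Case 2: playback is early, so insert all-zero (silent) segments to wait.
        silence = bytes(seg_len)
        for _ in range(count):
            cache.appendleft(silence)


cache = deque([b"aaaa", b"bbbb", b"cccc"])
resync(cache, 20_000, 10, 4)    # 20 ms late: two segments dropped
print(list(cache))              # [b'cccc']
resync(cache, -10_000, 10, 4)   # 10 ms early: one silent segment added
print(list(cache))              # [b'\x00\x00\x00\x00', b'cccc']
```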
It should be emphasized that, as explained above, the expected playback time point (playtime) carried by each audio segment is determined from the segment number, the expected playback time point of the first audio segment, and the adjustment coefficient. The adjustment coefficient changes dynamically with the number of audio segments buffered in the audio playback device 200. The calculation and update of the adjustment coefficient are described in detail below.
Specifically, once the audio playback device 200 determines that the current time has reached the expected playback time point of the first audio segment and starts audio playback, it also records how the number of audio segments buffered in the audio playback device 200 changes over time and calculates the adjustment coefficient from that change. That is, while performing step S405, the audio playback device 200 also performs step S406, as follows:
S406. The audio playback device 200 records how the number of audio segments buffered in the audio playback device 200 changes over time and calculates the adjustment coefficient from that change.
For example, with the audio playback device 200 again including a distribution module, an adjustment coefficient calculation module, a playback module, and a cache module, step S406 may specifically include steps S406a to S406d.
S406a. After receiving the notification to start audio playback, the playback module notifies the adjustment coefficient calculation module to start periodically collecting the number of audio segments in the cache module.
In some embodiments, the playback module may notify the adjustment coefficient calculation module immediately after receiving the notification to start audio playback, or after a period of time (for example, 1 second) has elapsed since that notification. In other embodiments, the playback module may notify the adjustment coefficient calculation module to start collecting when it detects that the absolute value of the difference between the expected playback time point and the actual playback time point of some audio segment exceeds a preset threshold A (or another threshold). In other words, this application does not specifically limit when the adjustment coefficient calculation module starts periodically collecting the number of audio segments in the cache module.
S406b. The adjustment coefficient calculation module periodically collects the number of audio segments in the cache module.
In some embodiments, the adjustment coefficient calculation module may set a timer and periodically query the cache module for the number of audio segments it currently stores, for example once every 200 μs. In other embodiments, the adjustment coefficient calculation module may instruct the cache module to report the number of stored audio segments periodically. That is, the adjustment coefficient calculation module sends the cache module an instruction to periodically collect the number of buffered audio segments; on receiving the instruction, the cache module sets a timer and periodically reports the number of audio segments it stores.
S406c. The adjustment coefficient calculation module stores the collected number of audio segments in the cache module together with the collection time point.
For example, the adjustment coefficient calculation module records each collection time point, and the number of audio segments in the cache module collected at that time point, in sampling queue 1. The content stored in sampling queue 1 is shown in Table 3.
Table 3
Figure PCTCN2021136897-appb-000001
S406d. When a preset condition is met, the adjustment coefficient calculation module calculates and updates the adjustment coefficient from the collected numbers of audio segments in the cache module and the corresponding collection time points.
The preset condition may be that the number of records in sampling queue 1 (each row in Table 3 is one record) reaches a predetermined number, for example 100, or that a preset time period (for example, 3 minutes) has elapsed since the adjustment coefficient was last calculated and updated.
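A minimal sketch of sampling queue 1 and the two trigger conditions. The class and method names are hypothetical, and collection times are passed in explicitly as seconds so that the behaviour is deterministic; a real device would read them from its own clock.

```python
from collections import deque


class SamplingQueue:
    """Records of (collection time in seconds, buffered segment count)."""

    def __init__(self, max_records: int = 100, max_age_s: float = 180.0,
                 start_s: float = 0.0) -> None:
        self.records = deque()
        self.max_records = max_records   # e.g. 100 records
        self.max_age_s = max_age_s       # e.g. 3 minutes since the last update
        self.last_update_s = start_s

    def add(self, time_s: float, segment_count: int) -> None:
        self.records.append((time_s, segment_count))

    def should_update(self, now_s: float) -> bool:
        """True when either preset condition is met."""
        enough_records = len(self.records) >= self.max_records
        too_old = now_s - self.last_update_s >= self.max_age_s
        return enough_records or too_old

    def drain(self, now_s: float) -> list:
        """Hand the samples to the fitting step and start a new period."""
        samples = list(self.records)
        self.records.clear()
        self.last_update_s = now_s
        return samples


queue = SamplingQueue(max_records=3)
for t, n in [(0.0, 100), (0.2, 101), (0.4, 101)]:
    queue.add(t, n)
print(queue.should_update(0.4))  # True
```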
In some embodiments, the adjustment coefficient calculation module may perform linear fitting on the data in sampling queue 1 to obtain the trend of the number of audio segments in the cache module, that is, the relationship between the segment count recorded in sampling queue 1 and time.
Normally, the audio source device 100 sends the audio stream to the audio playback device 200 according to its own playback progress. The cache module of the audio playback device 200 buffers the received audio stream, forming a buffer queue. The head of the buffer queue holds the audio segments obtained from the audio stream received earliest, and the tail holds those obtained from the audio stream received latest. The playback speed of the audio source device 100 (or the speed at which it delivers the audio stream to the audio playback device 200, the delivery speed for short) therefore determines how fast audio segments are added at the tail of the buffer queue. When the audio playback device 200 starts audio playback, it takes audio segments from the head of the buffer queue for playback and deletes the segments already played, so its playback speed determines how fast segments are removed from the head. Taken together, the difference between the playback speed (or delivery speed) of the audio source device 100 and the playback speed of the audio playback device 200 shows up as the trend of the number of audio segments in the buffer queue. Because network transmission conditions between the audio source device 100 and the audio playback device 200 also affect the segment count at individual time points, linear fitting is used to filter out individual abnormal samples.
Specifically, as shown in FIG. 5, discrete points are plotted on a two-dimensional plane with time on the X axis and the number of audio segments in the cache module on the Y axis. Each discrete point in FIG. 5 represents the number of audio segments collected at the corresponding collection time point. Linear regression over these points yields the straight line in FIG. 5. The slope of this line (denoted slope 1) represents the trend by which the number of audio segments in the cache module increases or decreases, that is, the number of segments added or removed per unit time. A positive slope 1 is the number of segments added per unit time and means that the playback speed (or delivery speed) of the audio source device 100 is faster than the playback speed of the audio playback device 200. A negative slope 1 is the number of segments removed per unit time and means that the playback speed (or delivery speed) of the audio source device 100 is slower than the playback speed of the audio playback device 200.
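Slope 1 can be obtained with an ordinary least-squares fit. The source only says "linear fitting" and "linear regression", so the closed-form estimator below is one possible choice; the function name and the (time, count) tuple layout are assumptions.

```python
def fit_slope(samples: list[tuple[float, float]]) -> float:
    """Least-squares slope of segment count (Y) against collection time (X)."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed")
    mean_t = sum(t for t, _ in samples) / n
    mean_c = sum(c for _, c in samples) / n
    covariance = sum((t - mean_t) * (c - mean_c) for t, c in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        raise ValueError("all samples share the same collection time")
    return covariance / variance


# The buffer grows by two segments per time unit: the source is faster than the player.
print(fit_slope([(0, 100), (1, 102), (2, 104), (3, 106)]))  # 2.0
```

A single outlier caused by network jitter shifts the fitted slope only slightly, which is the reason the text gives for fitting a line rather than comparing two raw samples.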
The adjustment coefficient can then be calculated with formula (1):
Adjustment coefficient = 1 - (slope 1 * numeric value of the segment playback duration / 1000)      Formula (1)
Here, the numeric value of the segment playback duration is its value when expressed in milliseconds; this value carries no unit. The adjustment coefficient and slope 1 are likewise unitless.
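Formula (1) is a one-line computation. In the sketch below the slope is assumed to be expressed in segments per second, which is what the division by 1000 suggests when the duration is in milliseconds; the source itself treats slope 1 as unitless, so this reading is an assumption.

```python
def adjustment_coefficient(slope: float, duration_ms: float) -> float:
    """Formula (1): 1 - (slope 1 * segment playback duration in ms / 1000)."""
    return 1 - slope * duration_ms / 1000


# Buffer growing: coefficient below 1, so later segments are scheduled earlier.
print(adjustment_coefficient(0.5, 10) < 1)   # True
# Buffer shrinking: coefficient above 1, so later segments are scheduled later.
print(adjustment_coefficient(-0.5, 10) > 1)  # True
# Stable buffer: the coefficient stays at its initial value.
print(adjustment_coefficient(0.0, 10))       # 1.0
```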
After obtaining the latest adjustment coefficient, the adjustment coefficient calculation module clears the data in sampling queue 1. The adjustment coefficient for the next period is then calculated from the numbers of audio segments in the cache module collected during that period.
Formula (2) was already obtained in step S404c:
Expected playback time point of the N-th audio segment = expected playback time point of the first audio segment + (N-1) * segment playback duration * 1000 * adjustment coefficient      Formula (2)
When slope 1 is positive, the playback speed (or delivery speed) of the audio source device 100 is faster than the playback speed of the audio playback device 200, so the playback progress of the audio source device 100 is also ahead of that of the audio playback device 200. Formula (1) then gives an adjustment coefficient less than 1, and formula (2) gives a smaller expected playback time point for the N-th audio segment than with an adjustment coefficient of 1; that is, the expected playback time point of the N-th audio segment is moved earlier. In other words, the playback progress of the audio playback device 200 speeds up so that it catches up with the audio source device 100 as soon as possible.
When slope 1 is negative, the playback speed (or delivery speed) of the audio source device 100 is slower than the playback speed of the audio playback device 200, so the playback progress of the audio source device 100 is also behind that of the audio playback device 200. Formula (1) then gives an adjustment coefficient greater than 1, and formula (2) gives a larger expected playback time point for the N-th audio segment than with an adjustment coefficient of 1; that is, the expected playback time point of the N-th audio segment is moved later. In other words, the playback progress of the audio playback device 200 slows down so that it matches that of the audio source device 100.
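The effect of the sign of slope 1 can be made concrete by chaining formulas (1) and (2). The sketch below reports how far the expected playback time point of segment N moves relative to an adjustment coefficient of 1; the function name and the sample slope values are illustrative assumptions.

```python
def playtime_shift_us(slope: float, index: int, duration_ms: int) -> int:
    """Shift of segment `index`'s expected playback time point, in microseconds."""
    coefficient = 1 - slope * duration_ms / 1000   # formula (1)
    nominal_us = (index - 1) * duration_ms * 1000  # offset when the coefficient is 1
    adjusted_us = round(nominal_us * coefficient)  # offset under formula (2)
    return adjusted_us - nominal_us


# Segment 1001 with 10 ms segments sits 10 s after the first segment.
print(playtime_shift_us(0.5, 1001, 10))   # -50000 (50 ms earlier)
print(playtime_shift_us(-0.5, 1001, 10))  # 50000 (50 ms later)
```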
综上,由于音频播放设备200中缓存的音频分片的数量变化趋势体现着音频播放设备200播放速度(或投放速度)与音频源设备100播放速度的差异,因此音频播放设备200根据音频播放设备200中缓存的音频分片的数量变化趋势更新调整系数,再根据调整系数计算音频分片的预计播放时间点,可以使得音频播放设备200播放速度与音频源设备100的播放速度(或投放速度)一致。进而可以避免造成音频播放设备200处缓存区的数据溢出或者耗尽,进而避免出现播放音频时声音卡顿或爆音的情况,提升外放音频的收听体验。To sum up, since the changing trend of the number of audio fragments cached in the audio playback device 200 reflects the difference between the playback speed (or delivery speed) of the audio playback device 200 and the playback speed of the audio source device 100, The change trend of the number of cached audio fragments in 200 updates the adjustment coefficient, and then calculates the expected playback time point of the audio fragment according to the adjustment coefficient, which can make the playback speed of the audio playback device 200 and the playback speed (or delivery speed) of the audio source device 100. Consistent. In this way, data overflow or exhaustion in the buffer area of the audio playback device 200 can be avoided, thereby avoiding the situation of sound stuttering or popping when playing the audio, and improving the listening experience of the externally played audio.
In other embodiments of the present application, hardware differences between the two devices (for example, different crystal oscillator frequencies) may cause the playback progress of the audio playback device 200 to become inconsistent with that of the audio source device 100. To address this, the audio playback device 200 may also adjust its own playback speed to match the playback speed of the audio source device 100, so that its playback progress stays consistent with that of the audio source device 100 over a long period.
As an example, the following description assumes that the audio playback device 200 includes a distribution module, a playback module, a cache module, and a playback speed module.
FIG. 6 shows the flow of yet another audio playback method provided by the present application. As shown in FIG. 6, the method includes steps S401 to S403, steps S404c to S404e, step S405a, steps S601 to S611, and step S406.
For steps S401 to S403, steps S404c to S404e, step S405a, and step S406, refer to the description of the related content in FIG. 4; details are not repeated here.
S601. The playback module obtains the content of the audio segments from the cache module.
When the current time equals the expected playback time point of the first audio segment, the distribution module notifies the playback module to start audio playback, and the playback module begins reading the content of the first audio segment and the subsequent audio segments from the cache module in sequence.
S602. The playback module writes the obtained audio segment content to the audio output driver.
The playback module writes the audio segments it has read, in sequence, to the audio output driver in the playback module (for example, ALSA), and plays the currently written audio segment through the audio output driver.
S603. The playback module calls an interface of the audio output driver to query the expected speaker output time of the currently written audio segment, that is, the actual playback time point of the currently written audio segment.
After writing an audio segment to the audio output driver, the playback module also calls an interface of the audio output driver to query the expected speaker output time of that segment. The expected speaker output time of an audio segment can be regarded as its actual playback time point.
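The query in step S603 can be pictured with a small sketch. The source does not say how the driver derives the expected speaker output time; the version below assumes it is the current time plus the time needed to drain the frames already queued ahead of the segment. The function name and parameters are hypothetical. On ALSA, a queued-frame count of this kind could plausibly be derived from `snd_pcm_delay`, but that mapping is an assumption and is not taken from the source.

```python
def expected_output_time(now_s: float, frames_ahead: int, sample_rate_hz: int) -> float:
    """Estimate when the first frame of a just-written segment reaches the speaker.

    now_s          -- current time on the device clock, in seconds
    frames_ahead   -- frames queued in the output driver ahead of the segment
                      (hypothetical input; how it is obtained is driver-specific)
    sample_rate_hz -- playback sample rate in frames per second
    """
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    if frames_ahead < 0:
        raise ValueError("frames_ahead must not be negative")
    # Time to drain the queued frames at the nominal sample rate.
    return now_s + frames_ahead / sample_rate_hz
```

With 4800 frames queued at 48000 frames per second, the segment would be expected at the speaker about 0.1 second after `now_s`.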
S604. The playback module calculates the difference between the actual playback time point of the audio segment and the expected playback time point (e.g., playtime) carried in the audio segment, and determines whether the absolute value of the difference is greater than the preset threshold A.
In some embodiments, the absolute value of the difference is compared with the preset threshold A (for example, 1 second). If the absolute value of the difference is greater than the preset threshold A, the playback progress of the audio playback device 200 differs considerably from that of the audio source device 100, and step S605 may be performed to quickly reduce the difference. If the absolute value is less than or equal to the preset threshold A, the difference in playback progress is small, and steps S606 to S611 may be performed; that is, the playback speed of the audio playback device 200 is adjusted so that both its playback progress and its playback speed become consistent with those of the audio source device 100. Of course, in some other embodiments the comparison may be skipped: after calculating the difference between the actual playback time point of the audio segment and the expected playback time point (e.g., playtime) carried in the audio segment, the playback module does not compare the absolute value of the difference with the preset threshold A but directly performs step S606.
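The branch in step S604 can be sketched as follows. This is only an illustration of the decision described above; the enum, the function name, and the tuple return value are hypothetical, and the default of 1 second for threshold A is the example value given in the text.

```python
from enum import Enum


class Action(Enum):
    RESYNC_BUFFER = "S605"   # coarse correction: delete or add audio segments
    ADJUST_SPEED = "S606"    # fine correction: feed the sample to the speed module


def choose_action(actual_s: float, expected_s: float, threshold_a_s: float = 1.0):
    """Return the action for one segment together with the signed difference."""
    diff = actual_s - expected_s          # positive: playing later than expected
    if abs(diff) > threshold_a_s:
        return Action.RESYNC_BUFFER, diff
    return Action.ADJUST_SPEED, diff      # |diff| <= A, including the equal case
```

A segment played 2.5 seconds late would take the S605 branch, while one played 0.2 seconds late would be passed on to steps S606 to S611.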
S605. The playback module notifies the cache module to delete or add audio segments.
After determining that the absolute value of the difference between the actual playback time point and the expected playback time point of the audio segment is greater than the preset threshold A, the playback module further determines, according to which of the two time points is later, whether to delete one or more audio segments or to add one or more audio segments.
If the actual playback time point of the audio segment is later than the expected playback time point carried in the audio segment, the playback progress of the audio playback device 200 lags behind that of the audio source device 100. The playback module may notify the cache module to delete some audio segments so that the playback progress of the audio playback device 200 catches up with that of the audio source device 100 as soon as possible.
If the actual playback time point of the audio segment is earlier than the expected playback time point carried in the audio segment, the playback progress of the audio playback device 200 is ahead of that of the audio source device 100. The playback module may notify the cache module to add some audio segments. The added audio segments may be silent audio data, a copy of the currently written audio segment, or other audio data. This is equivalent to the audio playback device 200 waiting for a corresponding period before playing the subsequent audio segments, which brings its playback progress in line with that of the audio source device 100.
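A minimal sketch of the coarse correction in step S605 follows. The source does not specify how many segments are deleted or added, or which ones; this version assumes fixed-length segments, converts the time difference into a whole number of segments, drops the oldest pending segments when playback is late, and prepends filler (for example, silence) when playback is early. All names are hypothetical.

```python
from collections import deque


def resync_buffer(buffer: deque, diff_s: float, segment_s: float, filler: bytes) -> int:
    """Delete or add segments so that playback progress moves toward the source.

    diff_s    -- actual playback time point minus expected playback time point
    segment_s -- duration of one audio segment in seconds (assumed constant)
    filler    -- payload for added segments, e.g. silent audio data
    Returns the signed change in segment count (negative means deleted).
    """
    if segment_s <= 0:
        raise ValueError("segment_s must be positive")
    count = int(abs(diff_s) // segment_s)   # whole segments covered by the difference
    if diff_s > 0:
        # Playing late: skip ahead by dropping the oldest pending segments.
        dropped = min(count, len(buffer))
        for _ in range(dropped):
            buffer.popleft()
        return -dropped
    # Playing early (or exactly on time): wait by inserting filler segments.
    for _ in range(count):
        buffer.appendleft(filler)
    return count
```

For a device that is 1 second late with 0.5-second segments, two pending segments are dropped; for a device that is 1.5 seconds early, three filler segments are placed at the head of the buffer.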
Subsequently, the playback module continues to monitor the difference between the actual playback time point and the expected playback time point of the following audio segments, determines whether the absolute value of the difference is greater than the preset threshold A, and takes the corresponding action.
S606. The playback module sends the actual playback time point of the audio segment and the corresponding difference to the playback speed module.
Adjusting the playback speed of the audio playback device 200 requires the trend of the difference over the actual playback time points. Therefore, when the playback module determines that the absolute value of the difference is less than or equal to the preset threshold A, it sends the difference and the actual playback time point corresponding to that difference to the playback speed module.
S607. The playback speed module stores the actual playback time point of the audio segment and the corresponding difference.
For example, the playback speed module records each received actual playback time point and its corresponding difference in sampling queue 2. The content stored in sampling queue 2 is shown in Table 4.
Table 4
[Figure PCTCN2021136897-appb-000002: Table 4, the content of sampling queue 2 (actual playback time points and their corresponding differences)]
S608. The playback speed module calculates the playback speed deviation according to the actual playback time points of the audio segments and their corresponding differences.
When the playback speed module determines that a certain condition is met, it may calculate the deviation between the playback speeds of the audio playback device 200 and the audio source device 100 from the data in sampling queue 2. The condition may be, for example, that the number of entries in sampling queue 2 (each row in Table 4 being one entry) reaches a predetermined number, such as 100.
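Sampling queue 2 and its trigger condition can be sketched as a bounded queue. The class and method names are hypothetical. The source does not say whether the queue is cleared after each calculation, so this sketch assumes a sliding window that always keeps the most recent entries.

```python
from collections import deque


class SampleQueue:
    """Stores (actual playback time point, difference) pairs, as in Table 4."""

    def __init__(self, required: int = 100):
        self.required = required               # predetermined number of entries
        self.samples = deque(maxlen=required)  # oldest entries fall out automatically

    def record(self, actual_s: float, diff_s: float) -> bool:
        """Store one entry; return True once enough entries exist to fit a trend."""
        self.samples.append((actual_s, diff_s))
        return len(self.samples) >= self.required
```

In use, the playback speed module would call `record` for every entry received in step S606 and start the calculation of step S608 when it returns `True`.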
In some embodiments, the playback speed module may perform a linear fit on the data in sampling queue 2 to obtain the trend of the deviation between the playback speeds of the audio playback device 200 and the audio source device 100.
Specifically, discrete points are plotted on a two-dimensional plane with the actual playback time point on the X axis and the difference between the actual playback time point and the expected playback time point on the Y axis. Linear regression on these points yields a fitted straight line whose slope (denoted slope 2) represents the deviation between the actual playback speed and the expected playback speed, that is, how much time deviation accumulates per unit of time. As shown in (1) of FIG. 7, when slope 2 of the fitted line is positive, the actual playback speed of the audio playback device 200 is slower than the expected playback speed, and the playback speed of the audio playback device 200 needs to be increased. As shown in (2) of FIG. 7, when slope 2 of the fitted line is negative, the actual playback speed of the audio playback device 200 is faster than the expected playback speed, and the playback speed of the audio playback device 200 needs to be reduced. If slope 2 is zero, the actual playback speed of the audio playback device 200 equals the expected playback speed, and no adjustment is needed.
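The linear fit that yields slope 2 can be illustrated with an ordinary least-squares slope. The source only says "linear regression", so the closed-form least-squares estimate used here is one reasonable reading rather than the confirmed method; the function name is hypothetical.

```python
def fit_slope(samples):
    """Least-squares slope of difference (Y) against actual playback time (X).

    samples -- iterable of (actual_playback_time_s, difference_s) pairs
    A positive result means the difference is growing, i.e. the device is
    playing slower than expected; a negative result means it is playing faster.
    """
    points = list(samples)
    n = len(points)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all samples share the same time point")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx
```

For samples whose difference grows by 10 milliseconds every second, the slope is about 0.01, meaning the device falls behind by about 1% of elapsed time.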
S609. The playback speed module calculates the target playback speed according to the playback speed deviation.
The target playback speed is the desired playback speed of the audio playback device 200. It can be calculated using formula (3):
Target playback speed = current playback speed × (1 + slope 2)        Formula (3)
It can be understood that when slope 2 is positive, the target playback speed calculated by formula (3) is higher than the current playback speed, and when slope 2 is negative, it is lower.
In other examples, a preset threshold B may also be set. When the absolute value of slope 2 is less than the preset threshold B, the drift between the actual and expected playback time points of the audio playback device 200 can be considered negligible, and the playback speed of the audio playback device 200 need not be adjusted. When the absolute value of slope 2 is greater than or equal to the preset threshold B, formula (3) is used to adjust the playback speed of the audio playback device 200.
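Formula (3) and the optional threshold B can be written as a short function. The function name is hypothetical, and a default of 0 for threshold B simply disables the dead band; the source gives no concrete value for B.

```python
def target_speed(current_speed: float, slope2: float, threshold_b: float = 0.0) -> float:
    """Apply formula (3): target = current * (1 + slope 2).

    When |slope 2| is below threshold B the current speed is kept unchanged.
    """
    if abs(slope2) < threshold_b:
        return current_speed
    return current_speed * (1.0 + slope2)
```

As a worked example, a device running at nominal speed 1.0 with slope 2 equal to 0.01 would be set to about 1.01, and with slope 2 equal to -0.02 to about 0.98.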
S610. The playback speed module sends the target playback speed to the playback module.
S611. The playback module sets the playback speed value of the audio output driver to the target playback speed.
For example, the playback module adjusts the playback speed of the audio output driver to (1 + slope 2) times the current speed.
Subsequently, the playback module continues to monitor the difference between the actual playback time point and the expected playback time point of the following audio segments, determines whether the absolute value of the difference is greater than the preset threshold A, and takes the corresponding action.
In summary, the playback speed of the audio playback device 200 is adjusted according to the difference between its actual and expected playback time points until it is consistent with the playback speed (or delivery speed) of the audio source device 100. This avoids the situation in which a speed mismatch between the two devices forces the audio playback device 200 to frequently add or delete audio segments in its cache.
In still other embodiments of the present application, the technical solution described with reference to FIG. 4 and the technical solution described with reference to FIG. 6 may be combined. That is, the adjustment coefficient is first calculated from the trend in the number of audio segments buffered in the audio playback device 200, and the expected playback time point of each audio segment is calculated according to the adjustment coefficient, which aligns the time points at which the audio playback device 200 and the audio source device 100 expect to start playing each audio segment. The audio playback device 200 may then further adjust its own playback speed to match the playback speed of the audio source device 100, so that its playback progress stays consistent with that of the audio source device 100 over a long period.
In still other embodiments of the present application, the audio playback device 200 is also connected to another audio playback device, for example, an audio playback device 300, and the audio playback device 200 and the audio playback device 300 play the audio content together. In this case, not only must the playback speed of the audio playback device 200 be consistent with the playback speed (or delivery speed) of the audio source device 100, but the playback speed of the audio playback device 300 must be as well.
FIG. 8 shows the flow of yet another audio playback method provided by the present application. As shown in FIG. 8, the method includes steps S401 to S406 and steps S801 to S805.
For steps S401 to S406, refer to the related content of the flow in FIG. 4. The following description focuses on the differences from the flow in FIG. 4.
First, a wired or wireless connection may be established between the audio playback device 200 and the audio playback device 300. The wireless connection may be, for example, Bluetooth, WLAN, or NFC.
S801. The audio playback device 200 and the audio playback device 300 perform time synchronization.
In some examples, the audio playback device 200 and the audio playback device 300 establish a wireless connection. Before they play audio content together, the audio playback device 200 and the audio playback device 300 perform time synchronization. For example, the time synchronization may be performed after the audio playback device 200 receives the audio stream (step S402), after the audio playback device 200 sends audio segments to the audio playback device 300 (step S802), or after the audio playback device 200 sends the notification to start audio playback to the audio playback device 300 (step S804).
Specifically, the audio playback device 200 (for example, its time synchronization module) may use the Simple Network Time Protocol (SNTP), the Precision Time Protocol (PTP), or the like to perform time synchronization with the audio playback device 300 (for example, its time synchronization module).
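For the SNTP option, the clock offset between the two devices can be estimated from the four timestamps of one request and response exchange. The sketch below uses the standard SNTP/NTP offset and round-trip delay formulas; which device acts as client is not stated in the source, so the roles here are an assumption, and the function name is hypothetical.

```python
def sntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Estimate clock offset and round-trip delay from one SNTP exchange.

    t1 -- request sent, on the client clock
    t2 -- request received, on the server clock
    t3 -- reply sent, on the server clock
    t4 -- reply received, on the client clock
    The offset is the amount to add to the client clock to match the server,
    assuming the network delay is symmetric in both directions.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay
```

For instance, with t1 = 100.000, t2 = 105.010, t3 = 105.020 and t4 = 100.050, the client clock is about 4.99 seconds behind the server and the round trip took about 0.04 seconds.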
S802. The audio playback device 200 sends audio segments to the audio playback device 300.
For example, after performing step S404c, the distribution module of the audio playback device 200 sends the resulting audio segments both to its own cache module (step S404d) and to the audio playback device 300 (for example, to the cache module of the audio playback device 300).
S803. The audio playback device 300 caches the audio segments.
For example, the cache module of the audio playback device 300 caches the received audio segments.
S804. The audio playback device 200 notifies the audio playback device 300 to start audio playback.
For example, when the distribution module of the audio playback device 200 determines that the current time is later than or equal to the expected playback time point of the first audio segment, it notifies both its own playback module (step S405a) and the audio playback device 300 (for example, the playback module of the audio playback device 300) to start audio playback.
S805. The audio playback device 300 starts audio playback.
For example, the audio playback device 300 reads the first audio segment and the subsequent audio segments from its own cache module and starts playing them.
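The fan-out in steps S802 to S805 can be modeled in a few lines. This is a simplified in-process sketch with hypothetical class and function names; in the described system the two devices communicate over a wired or wireless link, which is not modeled here.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Segment:
    seq: int
    playtime: float      # expected playback time point carried in the segment
    payload: bytes


@dataclass
class Player:
    name: str
    cache: deque = field(default_factory=deque)
    playing: bool = False

    def cache_segment(self, seg: Segment) -> None:   # S404d on device 200, S803 on device 300
        self.cache.append(seg)

    def start(self) -> None:                          # S405a on device 200, S805 on device 300
        self.playing = True


def distribute(master: Player, slave: Player, seg: Segment) -> None:
    """Send one segment to the master's own cache and to the slave (S404d, S802)."""
    master.cache_segment(seg)
    slave.cache_segment(seg)


def maybe_start(master: Player, slave: Player, now_s: float) -> bool:
    """Start both players once the first segment's playtime is reached (S405a, S804)."""
    if not master.playing and master.cache and now_s >= master.cache[0].playtime:
        master.start()
        slave.start()
    return master.playing
```

Because both devices receive the same `playtime` values and share a synchronized clock, starting on the same condition keeps their first segments aligned.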
Note that the audio segments received by the audio playback device 300 carry expected playback time points, and these expected playback time points are periodically updated by the audio playback device 200 according to the difference in playback progress between the audio source device 100 and the audio playback device 200.
As in the playback process of the audio playback device 200 (see the related content in step S405b), when the audio playback device 300 plays an audio segment, it writes the segment to its audio output driver (for example, ALSA) and calls an interface of the audio output driver to read the speaker output time point of the currently written audio segment. This speaker output time point can be regarded as the time point at which the audio playback device 300 actually plays the currently written audio segment, referred to for short as the actual playback time point of the currently written audio segment. By comparing the actual playback time point of the currently written audio segment with its expected playback time point, the audio playback device 300 deletes or adds audio segments in its cache module, so that its playback speed stays consistent with the playback speed (or delivery speed) of the audio source device 100.
FIG. 9 shows the flow of yet another audio playback method provided by the present application. As shown in FIG. 9, the method includes steps S401 to S404, step S405a, step S406, steps S601 to S611, steps S801 to S804, and steps S901 to S911.
The flow in this embodiment differs from the flow in FIG. 8 in that the audio playback device 200 records the difference between the actual playback time point of each audio segment and the expected playback time point carried in the segment, and adjusts its playback speed according to this difference so that its playback speed is consistent with that of the audio source device 100. That is, the audio playback device 200 performs steps S601 to S611; for details, refer to the related content of the flow in FIG. 6.
In addition, the audio playback device 300 adjusts its own playback speed in a similar way, so that its playback speed is also consistent with that of the audio source device 100. That is, the audio playback device 300 performs steps S901 to S911. In some examples, the audio playback device 300 further includes a playback speed module.
S901. The playback module of the audio playback device 300 obtains the content of the audio segments from the cache module.
S902. The playback module writes the obtained audio segment content to the audio output driver.
S903. The playback module calls an interface of the audio output driver to query the expected speaker output time of the currently written audio segment, that is, the actual playback time point of the currently written audio segment.
After writing an audio segment to the audio output driver, the playback module also calls an interface of the audio output driver to query the expected speaker output time of that segment. The expected speaker output time of an audio segment can be regarded as its actual playback time point.
S904. The playback module calculates the difference between the actual playback time point of the audio segment and the expected playback time point (e.g., playtime) carried in the audio segment, and determines whether the absolute value of the difference is greater than the preset threshold A.
S905. The playback module notifies the cache module to delete or add audio segments.
S906. The playback module sends the actual playback time point of the audio segment and the corresponding difference to the playback speed module.
S907. The playback speed module stores the actual playback time point of the audio segment and the corresponding difference.
S908. The playback speed module calculates the playback speed deviation according to the actual playback time points of the audio segments and their corresponding differences.
S909. The playback speed module calculates the target playback speed according to the playback speed deviation.
S910. The playback speed module sends the target playback speed to the playback module.
S911. The playback module sets the playback speed value of the audio output driver to the target playback speed.
For details of steps S901 to S911, refer to the related content of steps S601 to S611 in FIG. 6; they are not repeated here.
Note that the target playback speed of the audio playback device 300 calculated in steps S901 to S911 is the same or approximately the same as the target playback speed of the audio playback device 200 calculated in steps S601 to S611.
Of course, in some other embodiments, the audio playback device 200 may determine the expected playback time point of each audio segment using an existing technical solution, that is, without applying the adjustment coefficient to the expected playback time points. Instead, it adjusts its playback speed directly according to the difference between the actual playback time point and the expected playback time point of each audio segment until its playback speed is consistent with the playback speed (or delivery speed) of the audio source device. Likewise, the audio playback device 300 may determine the actual playback time point of each audio segment directly from its own audio output driver, calculate the difference between the actual and expected playback time points of each audio segment, and adjust its playback speed until it is consistent with the playback speed (or delivery speed) of the audio source device.
In FIG. 8 and FIG. 9, the audio playback device 200 may be a master speaker or a master earphone, and the audio playback device 300 may be a slave speaker or a slave earphone. Optionally, the audio playback device 200 may be a master speaker and the audio playback device 300 a slave earphone. Optionally, the audio playback device 200 may be a master earphone and the audio playback device 300 a slave speaker.
It should be noted that the audio playback device is not limited to dedicated audio playback devices such as speakers and earphones; it may also be a composite device such as a mobile device with a loudspeaker.
It should be noted that technical solutions obtained by freely combining all or part of any features of the foregoing embodiments also fall within the scope of the present application.
本申请实施例还提供一种芯片系统。如图10所示,该芯片系统包括至少一个处理器2101和至少一个接口电路1102。处理器2101和接口电路1102可通过线路互联。例如,接口电路1102可用于从其它装置(例如音频播放设备200的存储器)接收信号。又例如,接口电路1102可用于向其它装置(例如处理器2101)发送信号。示例性的,接口电路1102可读取存储器中存储的指令,并将该指令发送给处理器2101。当所述指令被处理器2101执行时,可使得电子设备执行上述实施例中的音频播放设备200(比如,音箱)执行的各个步骤。当然,该芯片系统还可以包含其他分立器件,本申请实 施例对此不作具体限定。The embodiments of the present application also provide a chip system. As shown in FIG. 10 , the chip system includes at least one processor 2101 and at least one interface circuit 1102 . The processor 2101 and the interface circuit 1102 may be interconnected by wires. For example, the interface circuit 1102 may be used to receive signals from other devices (eg, the memory of the audio playback device 200). As another example, the interface circuit 1102 may be used to send signals to other devices (eg, the processor 2101). Exemplarily, the interface circuit 1102 may read the instructions stored in the memory and send the instructions to the processor 2101 . When the instructions are executed by the processor 2101, the electronic device can be made to execute various steps executed by the audio playback device 200 (eg, a sound box) in the above-mentioned embodiment. Of course, the chip system may also include other discrete devices, which are not specifically limited in this embodiment of the present application.
It can be understood that, to implement the foregoing functions, the terminal and the like include hardware structures and/or software modules corresponding to each function. A person skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, the embodiments of the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the embodiments of the present invention.
In the embodiments of the present application, the terminal and the like may be divided into functional modules according to the foregoing method examples. For example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of the present invention is schematic and is merely a logical function division; other division manners are possible in actual implementation.
From the description of the foregoing implementations, a person skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the foregoing functional modules is used as an example. In practical applications, the foregoing functions may be allocated to different functional modules as required; that is, the internal structure of the apparatus is divided into different functional modules to complete all or some of the functions described above. For the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
The functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the embodiments of the present application essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of the present application. The foregoing storage medium includes various media that can store program code, such as a flash memory, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

  1. An audio playback method, applied to a first audio playback device, the first audio playback device communicating wirelessly with the audio source device, wherein the method comprises:
    receiving audio data sent by the audio source device;
    dividing the audio data into N audio fragments;
    buffering the N audio fragments, wherein an expected playback time point of each audio fragment is obtained according to a first adjustment coefficient;
    playing each audio fragment in sequence;
    periodically collecting a current number of buffered audio fragments and a collection time point corresponding to the current number;
    after the duration of the periodic collection reaches a preset duration, or after the number of periodic collections reaches a preset number,
    obtaining a second adjustment coefficient according to the current number collected each time and the collection time point corresponding to the current number collected each time; and obtaining an expected playback time point of each subsequent audio fragment according to the second adjustment coefficient;
    playing the subsequent audio fragments in sequence;
    wherein N is a positive integer greater than or equal to 2, and the first adjustment coefficient is a preset coefficient.
  2. The method according to claim 1, wherein before receiving the audio data sent by the audio source device, the method further comprises:
    receiving an indication, sent by the audio source device, to play audio.
  3. The method according to claim 1 or 2, wherein obtaining the second adjustment coefficient according to the current number collected each time and the collection time point corresponding to the current number collected each time comprises:
    performing linear fitting on the current number collected each time and the collection time point corresponding to the current number collected each time, to obtain a first slope;
    obtaining the second adjustment coefficient according to the first slope.
  4. The method according to any one of claims 1-3, wherein periodically collecting the current number of buffered audio fragments and the collection time point corresponding to the current number comprises:
    when the absolute value of the difference between an actual playback time point and the expected playback time point of any audio fragment is greater than a first threshold, starting, by the first audio playback device, to periodically collect the current number of buffered audio fragments and the collection time point corresponding to the current number;
    wherein the actual playback time point of the audio fragment is an expected speaker output time point of the audio fragment, and the expected speaker output time point of the audio fragment is obtained by the first audio playback device by calling and querying an interface of an audio output driver of the first audio playback device.
  5. An audio playback method, applied to a first audio playback device, the first audio playback device communicating wirelessly with the audio source device, wherein the method comprises:
    receiving audio data sent by the audio source device;
    dividing the audio data into N audio fragments;
    buffering the N audio fragments, wherein an expected playback time point of each audio fragment is obtained according to a first adjustment coefficient;
    playing each audio fragment in sequence;
    after the absolute value of the difference between an actual playback time point and the expected playback time point of the audio fragment is greater than a preset threshold, adjusting the number of buffered audio fragments;
    wherein the actual playback time point of the audio fragment is an expected speaker output time point of the audio fragment; the expected speaker output time point of the audio fragment is obtained by the first audio playback device by calling and querying an interface of an audio output driver of the first audio playback device; N is a positive integer greater than or equal to 2; and the first adjustment coefficient is a preset coefficient.
  6. The method according to claim 5, wherein adjusting the number of buffered audio fragments after the absolute value of the difference between the actual playback time point and the expected playback time point of the audio fragment is greater than the preset threshold comprises:
    after the absolute value of the difference between the actual playback time point and the expected playback time point of the audio fragment is greater than the preset threshold and the difference is negative, adding a first number of audio fragments;
    after the absolute value of the difference between the actual playback time point and the expected playback time point of the audio fragment is greater than the preset threshold and the difference is positive, deleting the first number of audio fragments.
  7. The method according to claim 6, wherein the first number is associated with the quotient of the absolute value of the difference divided by the playback duration of the audio fragment.
  8. The method according to claim 7, wherein the first number is the quotient of the absolute value of the difference divided by the playback duration of the audio fragment.
  9. The method according to any one of claims 6-8, wherein the added first number of audio fragments are silence data.
  10. The method according to any one of claims 5-9, wherein after the absolute value of the difference between the actual playback time point and the expected playback time point of the audio fragment is less than or equal to the preset threshold, the playback speed of the first audio playback device is adjusted.
  11. The method according to claim 10, wherein after the absolute value of the difference between the actual playback time point and the expected playback time point of the audio fragment is less than or equal to the preset threshold,
    collecting the actual playback time point of the audio fragment, and the difference between the actual playback time point and the expected playback time point of the audio fragment;
    after the number of collections reaches a preset number, or after the duration of the collection reaches a preset duration,
    performing linear fitting on the actual playback time point collected each time and the difference corresponding to the actual playback time point collected each time, to obtain a second slope;
    obtaining a current playback speed of the first audio playback device;
    obtaining an adjusted playback speed according to the current playback speed and the second slope;
    playing the subsequent audio fragments in sequence at the adjusted playback speed.
  12. The method according to any one of claims 5-11, wherein the first audio playback device is connected to a second audio playback device, and the method further comprises:
    sending the N audio fragments to the second audio playback device.
  13. The method according to claim 12, wherein before the first audio playback device plays the first audio fragment, the method further comprises:
    sending, to the second audio playback device, an indication to start playing the audio fragments.
  14. A first audio playback device, comprising a processor, an audio output apparatus, and a memory, wherein the audio output apparatus and the memory are both coupled to the processor, the memory is configured to store a computer program, and when the computer program is executed by the processor, the first audio playback device is caused to perform the method according to any one of claims 1-13.
  15. A computer-readable storage medium, comprising a computer program, wherein when the computer program runs on a first audio playback device, the first audio playback device is caused to perform the method according to any one of claims 1-13.
  16. A computer program product, wherein when the computer program product runs on a computer, the computer is caused to perform the method according to any one of claims 1-13.
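
By way of non-limiting illustration, the drift handling recited in claims 3 and 5 to 11 may be sketched in Python as follows. The sketch is one possible reading of the claims under stated assumptions and is not the applicant's implementation. All function names (`least_squares_slope`, `fragments_to_adjust`, `adjust_buffer`, `adjusted_speed`), the millisecond units, the truncation of the quotient to an integer, the insertion of silence at the head of the buffer, and the zero-filled silence data are hypothetical choices. In particular, the formula `current_speed * (1 + slope)` is an assumption: the claims state only that the adjusted playback speed is obtained according to the current playback speed and the second slope.

```python
from typing import List, Sequence


def least_squares_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Slope of the ordinary least-squares line through the points (xs[i], ys[i]).

    Usable for the "linear fitting" of claim 3 (buffered count vs. collection
    time) and of claim 11 (time difference vs. actual playback time).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def fragments_to_adjust(actual_ms: float, expected_ms: float,
                        fragment_ms: float, threshold_ms: float) -> int:
    """Signed fragment count: positive means add, negative means delete."""
    diff = actual_ms - expected_ms
    if abs(diff) <= threshold_ms:
        return 0  # within the threshold: claim 10 adjusts playback speed instead
    # Claim 8: quotient of |difference| by the fragment playback duration.
    # Truncating the quotient to an integer is an assumption of this sketch.
    count = int(abs(diff) // fragment_ms)
    # Claim 6: negative difference adds fragments, positive difference deletes.
    return count if diff < 0 else -count


def adjust_buffer(buffer: List[bytes], delta: int, fragment_bytes: int) -> List[bytes]:
    """Return a new buffer with `delta` fragments added (+) or deleted (-)."""
    if delta > 0:
        # Claim 9: added fragments are silence data. Zero-filled bytes and
        # insertion at the head of the buffer are assumptions.
        silence = bytes(fragment_bytes)
        return [silence] * delta + buffer
    if delta < 0:
        return buffer[-delta:]  # drop the first -delta fragments
    return list(buffer)


def adjusted_speed(current_speed: float, second_slope: float) -> float:
    """Hypothetical mapping from (current speed, second slope) to a new speed.

    A positive slope means the actual time point drifts later relative to the
    expected time point, so playback is sped up proportionally (assumption).
    """
    return current_speed * (1.0 + second_slope)
```

For instance, with 20 ms fragments and a 30 ms threshold, a fragment whose actual playback time point is 45 ms earlier than expected gives a difference of -45 ms, so two silent fragments are added. The same slope helper could serve for the first slope of claim 3; how that slope is converted into the second adjustment coefficient is not specified in the claims and is therefore not sketched.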
PCT/CN2021/136897 2021-02-27 2021-12-09 Audio playback method, device and system WO2022179246A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110221879.2A CN114974321B (en) 2021-02-27 2021-02-27 Audio playing method, equipment and system
CN202110221879.2 2021-02-27

Publications (1)

Publication Number Publication Date
WO2022179246A1 true WO2022179246A1 (en) 2022-09-01

Family

ID=82974161

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/136897 WO2022179246A1 (en) 2021-02-27 2021-12-09 Audio playback method, device and system

Country Status (2)

Country Link
CN (1) CN114974321B (en)
WO (1) WO2022179246A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117707752A (en) * 2023-05-31 2024-03-15 荣耀终端有限公司 Method for eliminating pop sound in audio, electronic equipment and readable storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115866309B (en) * 2022-11-29 2023-09-22 广州后为科技有限公司 Audio and video caching method and device supporting multipath video synchronization
CN115629733B (en) * 2022-12-20 2023-03-28 翱捷科技(深圳)有限公司 Audio playing method, chip, system and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105812902A (en) * 2016-03-17 2016-07-27 联发科技(新加坡)私人有限公司 Data play method, device and system
CN108495239A (en) * 2018-01-17 2018-09-04 深圳聚点互动科技有限公司 Method, apparatus, equipment and the storage medium that more equipment room audio precise synchronizations play
CN109918038A (en) * 2019-01-14 2019-06-21 珠海慧联科技有限公司 A kind of audio broadcasting speed synchronous method and system
US20190215349A1 (en) * 2016-09-14 2019-07-11 SonicSensory, Inc. Multi-device audio streaming system with synchronization
CN110134362A (en) * 2019-05-16 2019-08-16 北京小米移动软件有限公司 Audio frequency playing method, device, playback equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107801080A (en) * 2017-11-10 2018-03-13 普联技术有限公司 A kind of audio and video synchronization method, device and equipment
CN111918093B (en) * 2020-08-13 2021-10-26 腾讯科技(深圳)有限公司 Live broadcast data processing method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN114974321B (en) 2023-11-03
CN114974321A (en) 2022-08-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21927670

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21927670

Country of ref document: EP

Kind code of ref document: A1