WO2023015404A1 - Audio playback method, apparatus, electronic device and storage medium - Google Patents

Audio playback method, apparatus, electronic device and storage medium

Info

Publication number
WO2023015404A1
Authority
WO
WIPO (PCT)
Prior art keywords
time
audio data
decoding
offset
playback
Prior art date
Application number
PCT/CN2021/111435
Other languages
English (en)
Chinese (zh)
Inventor
张金梁
Original Assignee
深圳Tcl新技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳Tcl新技术有限公司 filed Critical 深圳Tcl新技术有限公司
Priority to PCT/CN2021/111435 priority Critical patent/WO2023015404A1/fr
Publication of WO2023015404A1 publication Critical patent/WO2023015404A1/fr

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 - Processing of audio elementary streams
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 - Details of television systems
    • H04N5/04 - Synchronising
    • H04N5/06 - Generation of synchronising signals

Definitions

  • the present application relates to the technical field of display control, in particular to an audio playback method, device, electronic equipment and storage medium.
  • After the audio player receives the audio data, it needs to perform back-end processing such as decoding on the audio data. Therefore, a certain amount of decoding time needs to be reserved for processing the audio data before it can be played.
  • Embodiments of the present application provide an audio playback method, device, electronic equipment, and storage medium, which can increase the time reserved for back-end decoding processing and realize synchronization of audio playback.
  • the embodiment of the present application provides an audio playback method, including: decoding an audio data set to be played on a terminal device, where the audio data set includes at least one frame of audio data; recording the start decoding reference time and the actual decoding processing time of the decoding process of each frame of audio data; obtaining the reference offset time and the expected playback time corresponding to the audio data; for each frame of audio data, adjusting the start decoding reference time according to the reference offset time to obtain the start decoding adjustment time; determining the reference playback time of the audio data according to the start decoding adjustment time and the actual decoding processing time; updating the reference playback time according to time changes; and, when the adjusted reference playback time reaches the expected playback time, playing the audio data on the terminal device.
  • an audio playback device including:
  • a decoding module configured to decode an audio data set to be played on the terminal device, where the audio data set includes at least one frame of audio data;
  • a recording module configured to record the start decoding reference time and the actual decoding processing time of the audio data decoding process of each frame
  • An acquisition module configured to acquire a reference offset time and an expected playback time corresponding to the audio data
  • An adjustment module configured to adjust the decoding start reference time according to the reference offset time for each frame of audio data, to obtain the decoding start adjustment time
  • a determination module configured to determine a reference playback time of the audio data according to the start decoding adjustment time and the actual decoding processing time
  • a timing module configured to update the reference playing time according to time changes
  • a playing module configured to play the audio data on the terminal device when the adjusted reference playing time reaches the expected playing time.
  • the audio data includes current audio data
  • the obtaining module includes:
  • a sampling unit configured to sample at least one frame of historical audio data, where the historical audio data is of the same data type as the current audio data;
  • the first acquisition unit is used to respectively acquire the historical reference playback time and the historical expected playback time after decoding the historical audio data of each frame;
  • a first determining unit configured to respectively determine the offset time of each frame of the historical audio data according to the historical reference playing time and the historical expected playing time;
  • the second determining unit is configured to determine a reference offset time of the current audio data according to at least one of the offset times.
  • the second determining unit includes:
  • a first determining subunit configured to determine an average value of at least one offset time to obtain an average offset time
  • the second determining subunit is configured to determine the reference offset time of the current audio data according to the average offset time.
  • the second determination unit is specifically further configured to:
  • the second determination unit is specifically further configured to:
  • the sampling unit is also specifically used for:
  • acquiring the historical audio data according to a preset frequency, the historical audio data including at least one frame;
  • the reference offset time includes an actual offset time
  • the adjustment module includes:
  • a second acquiring unit configured to acquire the reserved processing time of the audio data
  • a third determining unit configured to determine the actual offset time of the audio data according to the reserved processing time and the actual decoding processing time
  • the third obtaining unit is configured to adjust the decoding start reference time according to the actual offset time to obtain the decoding start adjustment time.
  • the second acquiring unit includes:
  • the third determining subunit is configured to determine the reserved processing time of the audio data according to the reference time of starting decoding of the audio data and the expected playing time.
  • the second acquiring unit is specifically further configured to:
  • the recording module is specifically used to:
  • It is used for determining the actual decoding processing time of the audio data according to the decoding start time and the decoding end time.
  • the determination module is specifically used to:
  • It is used for determining the reference playing time of the audio data according to the basic reference playing time and the end decoding time.
  • the acquisition module is specifically used to:
  • the obtaining module is also specifically used to:
  • the mapping relationship set includes a mapping relationship between a preset audio data type and a preset reference offset time
  • the method is used to acquire the reference offset time corresponding to the audio data according to the mapping relationship set and the audio data type.
  • By reducing the reference playback time of the audio data, the reference playback time changes with time, and once it reaches the expected playback time the audio data is played; that is, once the reference playback time of the audio data satisfies the expected playback time, the audio data is played synchronously. The reference playback time is reduced by reducing the start decoding reference time, and the actual decoding processing time of the audio data is not affected or limited. Therefore, this way of reducing the reference playback time corresponding to the audio data provides enough decoding time for the decoding process of the audio data, and avoids audio/picture desynchronization, frame loss, or stuttering in audio data playback caused by an overly long decoding processing time.
  • FIG. 1 is a schematic diagram of a scene of an audio playback method provided by the present application
  • Fig. 2 is a schematic flow chart of the audio playing method provided by the present application.
  • Fig. 3 is another schematic flow chart of the audio playing method provided by the present application.
  • Fig. 4 is another schematic flow chart of the audio playback method provided by the present application.
  • FIG. 5 is a schematic structural diagram of an audio playback device provided by the present application.
  • FIG. 6 is a schematic structural diagram of an electronic device provided by the present application.
  • Embodiments of the present application provide an audio playback method, device, electronic equipment, and storage medium.
  • The audio playback method provided by the embodiment of the present application can be executed by an electronic device, where the electronic device includes a terminal device or a server. The terminal device can be a TV, a mobile phone, a notebook, a desktop, or a tablet computer, etc. The server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN, Content Delivery Network), big data, and artificial intelligence platforms.
  • The terminal device and the server can be directly or indirectly connected through wired or wireless communication.
  • In the following, the case in which the terminal device and the server jointly execute the audio playback method is taken as an example. Data transmission between the terminal device and the server can be performed through a wired network, a wireless network, or broadcasting. Other devices can also be added between the terminal device and the server to assist in completing the audio playback method, and the types of such other devices are not limited here. The specific process in which the terminal device and the server jointly execute the audio playback method is as follows:
  • The terminal device receives the audio data set (audio data packet) to be played from the server; the processor in the terminal device then decodes the received audio data set, and records the start decoding reference time at which each frame of audio data starts the decoding process and the actual decoding processing time of each frame of audio data.
  • The terminal device also obtains the reference offset time and the expected playback time of each frame of audio data from the audio data. Then, for each frame of audio data, the start decoding reference time is adjusted according to the reference offset time to obtain the start decoding adjustment time; next, the reference playback time after the decoding of the audio data is completed is determined according to the start decoding adjustment time and the actual decoding processing time. Finally, when the reference playback time reaches the expected playback time, the terminal device plays the audio data. If the terminal device is a TV, the audio data can be played directly without using an additional audio player.
  • The reference time of the audio data (such as the start decoding reference time) is synchronized with the reference clock on the terminal device, and the reference clock is synchronized to accurate time according to the Network Time Protocol (NTP).
  • The source of accurate time is Coordinated Universal Time (UTC). For example, the reference clock on the terminal device displays Beijing time; when the terminal device starts decoding audio data at 6:00 Beijing time, the start decoding reference time is also 6:00.
  • the reference playback time is the time when the audio data can be played after decoding.
  • the reference playback time is also synchronized with the Beijing time, that is, the Beijing time after the audio data is decoded is the reference playback time of the audio data.
  • However, the timing of the reference clock is updated according to data transmission, and data transmission consumes time, so there is a time error in the timing of the reference clock. Therefore, in the embodiment of the present application, the reference playback time is determined based on the start decoding reference time and the actual decoding processing time of the decoding.
  • The actual decoding processing time is timed according to the feedback of the crystal oscillator, so the time record is very accurate. Therefore, an actual reference playback time can be obtained after the audio data decoding process; at this point, the timestamp of the reference playback time is no longer synchronized with the timestamp of the reference clock, but the reference playback time still increases with time (which can be Beijing time).
  • For example, when the reference playback time is 8:00, the Beijing time may be 8:01; that is, the timestamps of the reference playback time and Beijing time are no longer the same.
  • When the reference playback time of the audio data is equal to the expected playback time, the audio data can be played. However, the decoding process of the audio data takes a certain amount of time, and the reference playback time after decoding often exceeds the expected playback time corresponding to the audio data. Therefore, it is necessary to roll back the reference playback time after decoding so that it is synchronized with the expected playback time, or becomes synchronized with it as the reference playback time increases with time, and then play the audio data.
  • NTP (Network Time Protocol) is a protocol used to synchronize the clock of a computer to a server or clock source (such as a quartz clock, GPS, etc.); it can provide high-precision time correction (the difference from the standard is less than 1 millisecond on a LAN and tens of milliseconds on a WAN). NTP can obtain UTC time from atomic clocks, observatories, satellites, or from the Internet.
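  • The following is a minimal sketch of how a terminal device's reference clock could be corrected via NTP. It assumes the third-party Python package ntplib and the public pool.ntp.org servers, neither of which is named in this application; it illustrates the synchronization idea rather than the implementation used here.

```python
import time
import ntplib  # third-party NTP client, assumed for illustration


def reference_clock_offset(server: str = "pool.ntp.org") -> float:
    """Estimate the offset (seconds) between the local clock and an NTP server."""
    response = ntplib.NTPClient().request(server, version=3)
    return response.offset


def reference_clock_now(offset_s: float) -> float:
    """Reference-clock time: local wall-clock time corrected by the NTP offset."""
    return time.time() + offset_s
```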
  • The expected playback time is the time at which the audio data is expected to be presented, such as the presentation time stamp (PTS, Presentation Time Stamp).
  • When the PTS is reached, the audio data can be played; that is, the expected playback time of the audio data is synchronized with the international standard time.
  • When the reference playback time on the terminal device is later than the expected playback time of the audio data, the audio data cannot be played accurately, resulting in stuttering and similar problems.
  • By reducing the reference playback time of the audio data, the reference playback time changes with time, and once it reaches the expected playback time the audio data is played; that is, once the reference playback time of the audio data satisfies the expected playback time, the audio data is played synchronously. The reference playback time is reduced by reducing the start decoding reference time, and the actual decoding processing time of the audio data is not affected or limited. Therefore, this way of reducing the reference playback time corresponding to the audio data provides enough decoding time for the decoding process of the audio data, and avoids audio/picture desynchronization, frame loss, or stuttering in audio data playback caused by an overly long decoding processing time.
  • FIG. 2 is a schematic flowchart of an audio playback method provided by an embodiment of the present application.
  • the specific process of the audio playback method can be as follows:
  • The audio data to be played that is received by the terminal device is often encoded and compressed data. For example, the transmitted TLV data packet is an ISDB-S3 standard data packet, and the TLV packet corresponding to a 4K ultra-high-definition program generally carries a very large amount of program content data; therefore, in order to improve transmission efficiency, the audio data generally needs to be encoded and compressed. Correspondingly, the audio data received by the terminal device is compressed data that needs to be decoded, and the terminal device can only play the audio data after decoding it.
  • the decoding start reference time is synchronized with the reference clock on the terminal device, that is, when the audio data starts to be decoded, the time of the reference clock is the audio data decoding start reference time;
  • the actual decoding processing time of the audio data is the actual time used in the audio data decoding process.
  • The actual decoding processing time of the audio data can be accurately timed by a hardware timing device; that is, optionally, in some embodiments, the step of "recording the actual decoding processing time of the decoding processing of each frame of audio data" may specifically include:
  • Timing feedback is performed according to the beating of the crystal oscillator, and the start decoding time and the end decoding time of the audio data are obtained respectively;
  • the actual decoding processing time of the audio data is determined according to the decoding start time and the decoding end time.
  • For example, timing can be performed based on crystal oscillator feedback: when the terminal device is turned on, it starts timing from 0:00:00, and the timing process is driven by the beating of a 27 MHz crystal oscillator, so timing with the oscillator is relatively accurate.
  • Timing is performed when the audio data starts to be decoded to obtain the start decoding time; when decoding finishes, the end decoding time of the audio data is obtained, and the actual decoding processing time of the audio data can be obtained from the difference between the end decoding time and the start decoding time.
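  • A minimal sketch of the timing just described, using Python's monotonic clock as a stand-in for the counter driven by the 27 MHz crystal oscillator; decode_frame is a hypothetical decoder callable, not an API defined by this application.

```python
import time


def decode_with_timing(frame, decode_frame):
    """Record the start and end decoding times of one audio frame and return the
    decoded frame together with the actual decoding processing time (seconds)."""
    start_decoding_time = time.monotonic()   # start decoding time
    decoded = decode_frame(frame)            # back-end decoding of the frame
    end_decoding_time = time.monotonic()     # end decoding time
    return decoded, end_decoding_time - start_decoding_time
```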
  • The adjustment here means adjusting the reference playback time that audio data playback refers to, rather than adjusting the data corresponding to the reference clock; that is, the display time on the terminal device is not adjusted.
  • The reference offset time can be determined according to the difference between the reference playback time and the expected playback time of the audio data, and the reference playback time is rolled back by this difference, so that the adjusted reference playback time is less than or equal to the expected playback time.
  • The adjustment of the reference time is a millisecond-level difference that is difficult for a person to perceive.
  • For example, suppose the original reference playback time of the audio data is 7:00 plus 500 ms (the display time of the terminal device's reference clock being 7:00) and the reference offset time is 200 ms; the adjusted reference playback time is then 7:00 plus 300 ms. The display time of the reference clock is still 7:00, and a 200 ms millisecond-level adjustment is difficult for users to perceive, but 200 ms is far from negligible for audio decoding.
  • Adjusting the reference playback time by shifting it in this way allows the audio data to be played, provides enough time for audio data decoding, and reduces the performance requirements on the hardware, while the time difference remains difficult for users to perceive.
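  • A minimal sketch of this rollback, assuming all times are kept as millisecond counts; the 7:00 and 200 ms figures mirror the example above and are illustrative only.

```python
def adjust_reference_playback_time(reference_playback_ms: float,
                                   reference_offset_ms: float) -> float:
    """Roll the reference playback time back by the reference offset time so that
    the adjusted value can be less than or equal to the expected playback time."""
    return reference_playback_ms - reference_offset_ms


# Example from the description: 7:00 plus 500 ms rolled back by 200 ms -> 7:00 plus 300 ms.
base_ms = 7 * 3600 * 1000
print(adjust_reference_playback_time(base_ms + 500, 200) - base_ms)  # 300
```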
  • The reference offset time of the current audio data can be determined according to the offsets of past historical data; that is, optionally, in some embodiments, the step of "acquiring the reference offset time corresponding to the audio data" may specifically include:
  • sampling at least one frame of historical audio data, where the historical audio data is of the same data type as the current audio data; respectively acquiring the historical reference playback time and the historical expected playback time after decoding each frame of the historical audio data; respectively determining the offset time of each frame of the historical audio data according to the historical reference playback time and the historical expected playback time; and determining the reference offset time of the current audio data according to at least one of the offset times.
  • Since the sampled historical audio data has the same data type as the current audio data, the reference offset time by which the reference playback time of the current audio data should be adjusted can be inferred from it, so the acquired reference offset time is more accurate.
  • The historical audio data to be sampled can be replaced at a certain frequency; that is, optionally, in some embodiments, the step of "sampling at least one frame of historical audio data" may specifically include: acquiring the historical audio data according to a preset frequency, the historical audio data including at least one frame.
  • The reference offset of the current audio data may be determined by averaging the offset times corresponding to the historical audio data; that is, optionally, in some embodiments, the step of "determining the reference offset time of the current audio data according to at least one of the offset times" may specifically include: determining the average value of at least one offset time to obtain an average offset time; and determining the reference offset time of the current audio data according to the average offset time.
  • The average value indicates the overall offset of the historical audio data that has the same data type as the current audio data, and this overall offset reflects the range by which the reference playback time of the current audio data needs to be adjusted. Therefore, using the average offset time of the historical audio data as the reference offset time of the current audio data has a certain accuracy and reference value.
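  • A minimal sketch of this estimate, assuming the sampled history is available as (historical reference playback time, historical expected playback time) pairs in milliseconds; the sample values are hypothetical.

```python
from statistics import mean


def reference_offset_from_history(history):
    """Average the per-frame offsets (reference playback time minus expected playback
    time) of historical audio data of the same type to obtain the reference offset."""
    offsets = [reference_ms - expected_ms for reference_ms, expected_ms in history]
    return mean(offsets)


# Hypothetical sample: decoded frames finished 180-220 ms after their expected times.
print(reference_offset_from_history([(1180, 1000), (2220, 2000), (3200, 3000)]))  # 200
```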
  • the step of "determining the reference offset time of the current audio data according to at least one of the offset times" may specifically include:
  • the average integer offset time is determined as a reference offset time of the current audio data.
  • The rounding here can be rounding up (carrying any fractional part), for example treating any value after the decimal point as an extra 1 ms, so that the adjusted reference playback time is less than the expected playback time and the audio data can be played smoothly.
  • With an integer reference offset time, the interval by which the reference playback time of the audio data needs to be adjusted can be known intuitively.
  • Since decoding time is at the millisecond level, in order to improve calculation efficiency the time below one millisecond can be ignored; that is, only the integer part of each offset time is considered when selecting the reference offset time. Each offset time is rounded first and the average of these rounded offset times is then calculated. That is, optionally, in some embodiments, the step of "determining the reference offset time of the current audio data according to at least one of the offset times" may specifically include: rounding each offset time to obtain integer offset times; determining the average value of the integer offset times to obtain an integer average offset time; and determining the integer average offset time as the reference offset time of the current audio data.
  • Each offset time can be rounded up, for example by treating the value after the decimal point of each offset as an extra 1 ms, that is, increasing the integer part by 1 ms.
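  • A minimal sketch of the two rounding variants described above (round the average up, or round each offset up and then average); the offset values are hypothetical.

```python
import math
from statistics import mean


def offset_round_after_average(offsets_ms):
    """Average the offsets first, then round up to a whole millisecond."""
    return math.ceil(mean(offsets_ms))


def offset_round_before_average(offsets_ms):
    """Round each offset up to a whole millisecond first, then average."""
    return mean(math.ceil(o) for o in offsets_ms)


offsets = [199.3, 200.7, 201.2]                 # hypothetical per-frame offsets in ms
print(offset_round_after_average(offsets))      # 201
print(offset_round_before_average(offsets))     # 201 (mean of 200, 201, 202)
```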
  • The reference offset time indicates the basic offset, or overall offset, of the historical audio data of the same data type as the current audio data. Therefore, adjusting the reference playback time by the reference offset time can basically meet the requirement that the reference playback time be adjusted to less than or equal to the expected playback time.
  • The above reference offset time is obtained by taking an average, and the average of the offset times of the individual audio data represents the overall offset of the audio data set; therefore, the reference offset time obtained in this way can satisfy the adjustment requirement of the reference playback time to a certain extent.
  • However, the audio data may still have an error offset after being adjusted by the reference offset time. An error offset time can be determined from the actual offset time corresponding to the audio data and the reference offset time, and the reference playback time of the audio data can then be adjusted according to both the reference offset time and the error offset time. That is, optionally, in some embodiments, the step of "adjusting the start decoding reference time according to the reference offset time to obtain the start decoding adjustment time" may include: determining an error offset time according to the actual offset time and the reference offset time; and adjusting the start decoding reference time according to the reference offset time and the error offset time to obtain the start decoding adjustment time.
  • Since the reference offset time is calculated as the average of the offset times of historical audio data, the error offset time is determined from the actual offset time and the reference offset time; after the reference playback time has been adjusted by the reference offset time, a further adjustment is made according to the error offset time. This improves the accuracy of the adjustment of the reference playback time of the audio data, that is, it ensures that the adjusted reference playback time is less than or equal to the expected playback time.
  • The actual offset time corresponding to each frame of audio data can be determined, and the error offset time can be determined from this actual offset time and the reference offset time; then, on the basis of adjusting the reference playback time by the reference offset time, a further adjustment is made according to the error offset time.
  • Optionally, the reference offset time also includes an actual offset time; the step of "adjusting the start decoding reference time according to the reference offset time to obtain the start decoding adjustment time" may specifically include: acquiring the reserved processing time of the audio data; determining the actual offset time of the audio data according to the reserved processing time and the actual decoding processing time; and adjusting the start decoding reference time according to the actual offset time to obtain the start decoding adjustment time.
  • the audio data decoding start reference time is adjusted according to the actual offset time of the audio data.
  • The reserved processing time is the processing time reserved for decoding the audio data; it can be obtained from the start decoding reference time and the expected playback time. That is, optionally, in some embodiments, the step of "obtaining the reserved processing time of the audio data" may specifically include: determining the reserved processing time of the audio data according to the start decoding reference time of the audio data and the expected playback time.
  • The reserved processing time of the audio data can be obtained from the difference between the expected playback time and the start decoding reference time, and the actual offset time of the audio data decoding can be obtained by comparing the reserved processing time with the actual decoding processing time.
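  • A minimal sketch of these relations, with all times in milliseconds. The source states that the reserved processing time is the difference between the expected playback time and the start decoding reference time; the subtraction directions used for the actual offset and the error offset below are assumptions consistent with that description.

```python
def reserved_processing_time(expected_playback_ms: float,
                             start_decoding_reference_ms: float) -> float:
    """Processing time reserved for decoding before the expected playback time."""
    return expected_playback_ms - start_decoding_reference_ms


def actual_offset_time(reserved_ms: float, actual_decoding_ms: float) -> float:
    """How far the actual decoding processing time ran past the reserved time."""
    return actual_decoding_ms - reserved_ms


def error_offset_time(actual_offset_ms: float, reference_offset_ms: float) -> float:
    """Residual offset not already covered by the history-based reference offset."""
    return actual_offset_ms - reference_offset_ms
```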
  • the expected playing time of the audio data may be directly extracted from the data information of the audio data, that is, optionally, in some embodiments, the step "obtaining the expected playing time corresponding to the audio data" may specifically include:
  • the expected playing time of the audio data is directly extracted from the audio data.
  • The reference offset time of the audio data may also be determined according to the type of the audio data; that is, optionally, in some embodiments, the step of "acquiring the reference offset time corresponding to the audio data" may specifically include: acquiring a mapping relationship set, where the mapping relationship set includes mapping relationships between preset audio data types and preset reference offset times; and acquiring the reference offset time corresponding to the audio data according to the mapping relationship set and the audio data type. The mapping relationship set can be determined from the past relationship between historical audio data types and their offset times; for example, statistics over historical audio data types and offset times can be used to comprehensively determine the reference offset time corresponding to each audio data type.
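  • A minimal sketch of such a lookup; the audio data types and preset offsets in the mapping relationship set below are hypothetical values, not figures given in this application.

```python
# Hypothetical mapping relationship set: preset audio data type -> preset reference offset (ms).
MAPPING_RELATIONSHIP_SET = {
    "AAC_48kHz_stereo": 200,
    "AC3_48kHz_5.1": 250,
    "MPEG1_L2_48kHz": 150,
}


def reference_offset_for_type(audio_type: str, default_ms: int = 200) -> int:
    """Look up the preset reference offset time for the given audio data type."""
    return MAPPING_RELATIONSHIP_SET.get(audio_type, default_ms)


print(reference_offset_for_type("AC3_48kHz_5.1"))  # 250
```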
  • The start decoding reference time before the audio data decoding process and the reference playback time after the decoding process lie on the same timeline; that is, a change of the start decoding reference time brings about a change of the reference playback time, so adjusting the start decoding reference time at the beginning of processing is equivalent to adjusting the reference playback time after decoding.
  • FIG. 3 is a schematic flow diagram of reference playback time adjustment in the embodiment of the present application.
  • The start decoding reference time can be adjusted when the audio data starts to be processed; the final reference playback time after the audio data has been processed then changes accordingly, which is equivalent to updating and adjusting the final reference playback time through the adjustment of the start decoding reference time, as follows:
  • Step 111: Obtain the audio data set, that is, the audio data packet (TLV stream);
  • In this way, the hardware device playing the audio data can obtain a reference playback time that is less than or equal to the expected playback time; that is, after the start decoding reference time is adjusted, the reference playback time obtained by the hardware device after decoding is the adjusted reference playback time.
  • The sum of the start decoding reference time of the audio data and the actual decoding processing time is the reference time after the audio data has been decoded. By referring to the offset time of the historical audio data, the start decoding reference time of the current audio data is adjusted so that the reference playback time after the audio data has been decoded is less than or equal to the expected playback time of the audio data.
  • The reference playback time of the audio data may be determined from the time information before the audio data decoding process and the time information after the decoding process; that is, optionally, in some embodiments, the step of "determining the reference playback time of the audio data according to the start decoding adjustment time and the actual decoding processing time" may specifically include:
  • determining a basic reference playback time according to the start decoding adjustment time and the start decoding time; and determining the reference playback time of the audio data according to the basic reference playback time and the end decoding time.
  • The difference between the start decoding adjustment time and the start decoding time gives the basic reference playback time of the audio data, and the sum of the basic reference playback time and the end decoding time gives the reference playback time of the audio data; that is, the difference between the reference playback time and the start decoding adjustment time equals the difference between the end decoding time and the start decoding time.
  • Acquiring the reference playback time via the basic reference playback time means that the adjustment by the reference offset time can be applied when decoding of the audio data starts, that is, by adjusting the start decoding reference time of the audio data; the adjusted start decoding adjustment time then acts synchronously on the final reference playback time.
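  • A minimal sketch of this determination, with all times in milliseconds; the formula is reconstructed from the relations stated above (basic reference playback time = start decoding adjustment time - start decoding time; reference playback time = basic reference playback time + end decoding time).

```python
def reference_playback_time(start_decoding_adjustment_ms: float,
                            start_decoding_ms: float,
                            end_decoding_ms: float) -> float:
    """Reference playback time of a decoded frame: the adjusted start time plus the
    actual decoding processing time (end decoding time - start decoding time)."""
    basic_reference_ms = start_decoding_adjustment_ms - start_decoding_ms
    return basic_reference_ms + end_decoding_ms
```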
  • the reference playback time is the reference playback time of the audio data, and when the reference playback time reaches the desired playback time, the audio data can be played;
  • the reference playback time of the current audio data can be inferred according to the offset time of the historical audio data
  • The reference playback time is rolled back according to the reference offset time, so that the reference playback time is no longer kept synchronized with the time of the reference clock but is instead adjusted to a time that is less than or equal to the expected playback time.
  • When the adjusted reference playback time reaches the expected playback time, the audio data can be played; since the decoding of audio data is at the millisecond level, the time difference between the adjusted reference playback time and the expected playback time is also at the millisecond level.
  • Because the adjustment applied to each frame of audio data is relatively fixed, the difference between the adjusted reference playback times of successive frames is also relatively fixed, so continuous playback of the audio can still be achieved.
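  • A minimal sketch of letting the adjusted reference playback time advance with real time until it reaches the expected playback time; play is a hypothetical output callback, and the 1 ms polling step is an arbitrary choice for illustration.

```python
import time


def play_when_due(decoded_frame, adjusted_reference_playback_ms: float,
                  expected_playback_ms: float, play) -> None:
    """Update the adjusted reference playback time with the passage of time and play
    the decoded frame once it reaches the expected playback time."""
    start_wall_ms = time.monotonic() * 1000.0
    reference_ms = adjusted_reference_playback_ms
    while reference_ms < expected_playback_ms:
        time.sleep(0.001)  # advance in roughly 1 ms steps
        elapsed_ms = time.monotonic() * 1000.0 - start_wall_ms
        reference_ms = adjusted_reference_playback_ms + elapsed_ms
    play(decoded_frame)
```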
  • Step 121: Obtain the audio data set, that is, the audio data packet (TLV stream);
  • Step 122: Obtain the timestamp of the expected playback time of each frame of audio data from the audio data set, and check the timestamp of the expected playback time to judge the validity of the timestamp data; if it is valid, proceed to step 123, otherwise continue with step 122;
  • step 123 Record the start decoding time of audio data decoding, and check the timestamp of the start decoding time to judge the validity of the timestamp data. If it is valid, enter step 124, otherwise, continue to step 123;
  • Step 127: Determine the error offset time of the reference playback time according to the reference offset time and the actual offset time, and, on the basis of the adjustment by the reference offset time, continue to adjust the basic reference playback time according to the error offset time (which is essentially still an adjustment of the start decoding reference time) to obtain the basic reference playback adjustment time; check the validity of the basic reference playback adjustment time, and if it is invalid, repeat step 127, otherwise proceed to step 128;
  • In the adjustment process, the start decoding reference time is first adjusted according to the reference offset time, and the error offset time remaining after that adjustment is then determined according to the actual decoding processing time; a further adjustment is then made according to the error offset time. This improves the accuracy of the adjustment of the start decoding reference time, that is, the accuracy of the final reference playback time of the audio data.
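  • A minimal end-to-end sketch of the two-stage adjustment described in this flow, with all times in milliseconds; the function and variable names, and the choice to apply the error offset only when it is positive, are assumptions rather than details fixed by this application.

```python
def adjust_start_decoding_time(start_decoding_reference_ms: float,
                               reference_offset_ms: float,
                               expected_playback_ms: float,
                               actual_decoding_ms: float) -> float:
    """Two-stage adjustment of the start decoding reference time:
    1) roll it back by the history-based reference offset time;
    2) roll it back further by any residual error offset measured for this frame."""
    adjusted_ms = start_decoding_reference_ms - reference_offset_ms   # stage 1

    reserved_ms = expected_playback_ms - start_decoding_reference_ms  # reserved processing time
    error_offset_ms = (actual_decoding_ms - reserved_ms) - reference_offset_ms
    if error_offset_ms > 0:                                           # stage 2, only if needed
        adjusted_ms -= error_offset_ms
    # The resulting reference playback time (adjusted_ms + actual_decoding_ms)
    # is then less than or equal to expected_playback_ms.
    return adjusted_ms
```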
  • By reducing the reference playback time of the audio data, the reference playback time changes with time, and once it reaches the expected playback time the audio data is played; that is, once the reference playback time of the audio data satisfies the expected playback time, the audio data is played synchronously. The reference playback time is reduced by reducing the start decoding reference time, and the actual decoding processing time of the audio data is not affected or limited. Therefore, this way of reducing the reference playback time corresponding to the audio data provides enough decoding time for the decoding process of the audio data, and avoids audio/picture desynchronization, frame loss, or stuttering in audio data playback caused by an overly long decoding processing time.
  • the present application also provides an audio playing device based on the above audio playing method.
  • The meanings of the terms are the same as those in the audio playback method above; for specific implementation details, please refer to the description in the method embodiments.
  • FIG. 5 is a schematic structural diagram of an audio playback device provided by the present application, wherein the audio playback device may include a decoding module 201, a recording module 202, an acquisition module 203, an adjustment module 204, a determination module 205, a timing module 206 and Play module 207, specifically can be as follows:
  • the decoding module 201 is configured to decode an audio data set to be played on a terminal device, where the audio data set includes at least one frame of audio data.
  • The audio data to be played that is received by the terminal device is often encoded and compressed data. For example, the transmitted TLV data packet is an ISDB-S3 standard data packet, and the TLV packet corresponding to a 4K ultra-high-definition program generally carries a very large amount of program content data; therefore, in order to improve transmission efficiency, the audio data generally needs to be encoded and compressed. Correspondingly, the audio data received by the terminal device is compressed data that needs to be decoded, and the terminal device can only play the audio data after decoding it.
  • The recording module 202 is configured to record the start decoding reference time and the actual decoding processing time of the decoding processing of each frame of audio data.
  • the decoding start reference time is synchronized with the reference clock on the terminal device, that is, when the audio data starts to be decoded, the time of the reference clock is the audio data decoding start reference time;
  • the actual decoding processing time of the audio data is the actual time used in the audio data decoding process.
  • the recording module 202 is specifically used to:
  • It is used for determining the actual decoding processing time of the audio data according to the decoding start time and the decoding end time.
  • In this application, a timing device based on the beating of the crystal oscillator is used to time the actual decoding of the audio data; timing based on crystal oscillator feedback is more accurate.
  • An acquisition module 203 configured to acquire a reference offset time and an expected playback time corresponding to the audio data
  • The adjustment here means adjusting the reference playback time that audio data playback refers to, rather than adjusting the data corresponding to the reference clock; that is, the display time on the terminal device is not adjusted.
  • The reference offset time can be determined according to the difference between the reference playback time and the expected playback time of the audio data, and the reference playback time is rolled back by this difference, so that the adjusted reference playback time is less than or equal to the expected playback time.
  • The adjustment of the reference time is a millisecond-level difference that is difficult for a person to perceive.
  • For example, suppose the original reference playback time of the audio data is 7:00 plus 500 ms (the display time of the reference clock being 7:00) and the reference offset time is 200 ms; the adjusted reference playback time is then 7:00 plus 300 ms. The display time of the reference clock is still 7:00, and a 200 ms millisecond-level adjustment is difficult for users to perceive, but 200 ms is far from negligible for audio decoding.
  • Adjusting the reference playback time by shifting it in this way allows the audio data to be played, provides enough time for audio data decoding, and reduces the performance requirements on the hardware, while the time difference remains difficult for users to perceive.
  • the audio data includes current audio data
  • the obtaining module 203 includes:
  • a sampling unit configured to sample at least one frame of historical audio data, where the historical audio data is of the same data type as the current audio data;
  • the first acquisition unit is used to respectively acquire the historical reference playback time and the historical expected playback time after decoding the historical audio data of each frame;
  • a first determining unit configured to respectively determine the offset time of each frame of the historical audio data according to the historical reference playing time and the historical expected playing time;
  • the second determining unit is configured to determine a reference offset time of the current audio data according to at least one of the offset times.
  • Since the sampled historical audio data has the same data type as the current audio data, the reference offset time by which the reference playback time of the current audio data should be adjusted can be inferred from it, so the acquired reference offset time is more accurate.
  • the second determining unit includes:
  • a first determining subunit configured to determine an average value of at least one offset time to obtain an average offset time
  • the second determining subunit is configured to determine the reference offset time of the current audio data according to the average offset time.
  • The average value indicates the overall offset of the historical audio data that has the same data type as the current audio data, and this overall offset reflects the range by which the reference playback time of the current audio data needs to be adjusted. Therefore, using the average offset time of the historical audio data as the reference offset time of the current audio data has a certain accuracy and reference value.
  • the second determination unit is specifically further configured to:
  • the second determination unit is specifically further configured to:
  • the sampling unit is also specifically used for:
  • acquiring the historical audio data according to a preset frequency, the historical audio data including at least one frame;
  • the acquiring module 203 is specifically used to:
  • the acquiring module 203 is specifically further configured to:
  • the mapping relationship set includes a mapping relationship between a preset audio data type and a preset reference offset time
  • the method is used to acquire the reference offset time corresponding to the audio data according to the mapping relationship set and the audio data type.
  • the adjustment module 204 is configured to adjust the decoding start reference time according to the reference offset time for each frame of audio data to obtain the decoding start adjustment time.
  • The start decoding reference time can be adjusted according to the reference offset time, or according to the actual offset time, or the error offset time can additionally be applied on top of the reference offset time adjustment, in order to ensure the accuracy of the reference playback time adjustment.
  • the reference offset time includes an actual offset time
  • the adjustment module 204 includes:
  • a second acquiring unit configured to acquire the reserved processing time of the audio data
  • a third determining unit configured to determine the actual offset time of the audio data according to the reserved processing time and the actual decoding processing time
  • the third obtaining unit is configured to adjust the decoding start reference time according to the actual offset time to obtain the decoding start adjustment time.
  • the second acquiring unit includes:
  • the third determining subunit is configured to determine the reserved processing time of the audio data according to the reference time of starting decoding of the audio data and the expected playback time.
  • the second acquiring unit is specifically further configured to:
  • a determining module 205 configured to determine a reference playback time of the audio data according to the start decoding adjustment time and the actual decoding processing time;
  • the determination module 205 is specifically used to:
  • It is used for determining the reference playing time of the audio data according to the basic reference playing time and the end decoding time.
  • a timing module 206 configured to update the reference playing time according to time changes
  • The playing module 207 is configured to play the audio data on the terminal device when the adjusted reference playing time reaches the expected playing time.
  • the audio data acquired by the terminal device is firstly decoded by the decoding module 201.
  • The recording module 202 records the start decoding reference time before the audio data decoding process and the actual decoding processing time of the decoding process.
  • The acquisition module 203 extracts from the audio data the expected playback time of each frame and the reference offset time corresponding to each frame; the adjustment module 204 then adjusts (rolls back) the start decoding reference time according to the reference offset time, so that the start decoding reference time is no longer synchronized with the time of the reference clock. Next, the determination module 205 determines the reference playback time after the audio data has been decoded according to the start decoding adjustment time and the actual decoding processing time (that is, the reference playback time is determined from the start decoding adjustment time and the actual decoding processing time, rather than being synchronized with the time of the reference clock). At the same time, the timing module 206 keeps updating the adjusted reference playback time; once the current time corresponding to the updated reference playback time equals the expected playback time of the audio data, the playback module 207 plays the corresponding audio data. In the embodiment of the present application, the reference playback time changes with time, and this way of reducing the reference playback time corresponding to the audio data provides sufficient decoding time for the decoding processing of the audio data, avoiding audio/picture desynchronization, frame loss, or stuttering in the playback of the audio data caused by an overly long decoding processing time.
  • FIG. 6 shows a schematic structural diagram of the electronic device involved in the present application, specifically:
  • the electronic device may include a processor 401 of one or more processing cores, a memory 402 of one or more computer-readable storage media, a power supply 403, an input unit 404 and other components.
  • The structure shown in FIG. 6 does not constitute a limitation on the electronic device, which may include more or fewer components than shown in the figure, combine some components, or use a different arrangement of components, in which:
  • The processor 401 is the control center of the electronic device, and uses various interfaces and lines to connect the various parts of the entire electronic device; by running or executing software programs and/or modules stored in the memory 402 and calling the data stored in the memory 402, it performs the various functions of the electronic device and processes data, thereby monitoring the electronic device as a whole.
  • The processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, etc., and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 401.
  • the memory 402 can be used to store software programs and modules, and the processor 401 executes various functional applications and decoding processing by running the software programs and modules stored in the memory 402 .
  • The memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required for at least one function (such as a sound playback function, an image playback function, etc.), and the data storage area may store data created through the use of the electronic device, etc.
  • the memory 402 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage devices.
  • the memory 402 may further include a memory controller to provide the processor 401 with access to the memory 402 .
  • the electronic device also includes a power supply 403 for supplying power to various components.
  • the power supply 403 can be logically connected to the processor 401 through a power management system, so that functions such as charging, discharging, and power consumption management can be implemented through the power management system.
  • the power supply 403 may also include one or more DC or AC power supplies, recharging systems, power failure detection circuits, power converters or inverters, power status indicators and other arbitrary components.
  • the electronic device can also include an input unit 404, which can be used to receive input numbers or character information, and generate keyboard, mouse, joystick, optical or trackball signal input related to user settings and function control.
  • the electronic device may also include a display unit, etc., which will not be repeated here.
  • Specifically, the processor 401 in the electronic device loads the executable file corresponding to the process of one or more application programs into the memory 402 according to the following instructions, and the processor 401 runs the application programs stored in the memory 402, thereby realizing various functions, as follows:
  • Decode an audio data set to be played on a terminal device, where the audio data set includes at least one frame of audio data; record the start decoding reference time and the actual decoding processing time of the decoding process of each frame of audio data; obtain the reference offset time and the expected playback time corresponding to the audio data; for each frame of audio data, adjust the start decoding reference time according to the reference offset time to obtain the start decoding adjustment time; determine the reference playback time of the audio data according to the start decoding adjustment time and the actual decoding processing time; update the reference playback time according to time changes; and when the adjusted reference playback time reaches the expected playback time, play the audio data on the terminal device.
  • By reducing the reference playback time of the audio data, the reference playback time changes with time, and once it reaches the expected playback time the audio data is played; that is, once the reference playback time of the audio data satisfies the expected playback time, the audio data is played synchronously. The reference playback time is reduced by reducing the start decoding reference time, and the actual decoding processing time of the audio data is not affected or limited. Therefore, this way of reducing the reference playback time corresponding to the audio data provides enough decoding time for the decoding process of the audio data, and avoids audio/picture desynchronization, frame loss, or stuttering in audio data playback caused by an overly long decoding processing time.
  • the present application provides a storage medium in which a plurality of instructions are stored, and the instructions can be loaded by a processor to execute the steps in any audio playback method provided in the present application.
  • For example, the instructions can perform the following steps:
  • Decode an audio data set to be played on a terminal device, where the audio data set includes at least one frame of audio data; record the start decoding reference time and the actual decoding processing time of the decoding process of each frame of audio data; obtain the reference offset time and the expected playback time corresponding to the audio data; for each frame of audio data, adjust the start decoding reference time according to the reference offset time to obtain the start decoding adjustment time; determine the reference playback time of the audio data according to the start decoding adjustment time and the actual decoding processing time; update the reference playback time according to time changes; and when the adjusted reference playback time reaches the expected playback time, play the audio data on the terminal device.
  • The storage medium may include: a read-only memory (ROM, Read Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

An audio playback method, an apparatus, an electronic device, and a storage medium are disclosed in embodiments of the present application. One embodiment comprises: decoding an audio data set to be played on a terminal device, the audio data set comprising at least one frame of audio data; recording a start decoding reference time and an actual decoding time for decoding each frame of audio data; obtaining an expected playback time and a reference offset time corresponding to the audio data; adjusting the start decoding reference time for each frame of audio data according to the reference offset time to obtain an adjusted start decoding time; determining a reference playback time of the audio data according to the adjusted start decoding time and the actual decoding time; adjusting the reference playback time according to the change in time; and playing the audio data on the terminal device when the adjusted reference playback time reaches the expected playback time. The adjusted playback time of the audio data is intended to be less than or equal to the expected playback time, so that the audio data can be played synchronously.
PCT/CN2021/111435 2021-08-09 2021-08-09 Audio playback method, apparatus, electronic device and storage medium WO2023015404A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/111435 WO2023015404A1 (fr) 2021-08-09 2021-08-09 Audio playback method, apparatus, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/111435 WO2023015404A1 (fr) 2021-08-09 2021-08-09 Audio playback method, apparatus, electronic device and storage medium

Publications (1)

Publication Number Publication Date
WO2023015404A1 true WO2023015404A1 (fr) 2023-02-16

Family

ID=85199741

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/111435 WO2023015404A1 (fr) 2021-08-09 2021-08-09 Audio playback method, apparatus, electronic device and storage medium

Country Status (1)

Country Link
WO (1) WO2023015404A1 (fr)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060282774A1 (en) * 2005-06-10 2006-12-14 Michele Covell Method and system for improving interactive media response systems using visual cues
US20210065749A1 (en) * 2019-09-04 2021-03-04 Sagemcom Broadband Sas Method of decoding an incoming audio/video stream
CN112449233A (zh) * 2019-09-04 2021-03-05 Sagemcom Broadband Sas Method of decoding an incoming audio/video stream

Similar Documents

Publication Publication Date Title
US20240137202A1 (en) Method and apparatus for time synchronisation in wireless networks
US20200068248A1 (en) Dynamic Control of Fingerprinting Rate to Facilitate Time-Accurate Revision of Media Content
US8233648B2 (en) Ad-hoc adaptive wireless mobile sound system
US8321593B2 (en) Time synchronization of media playback in multiple processes
US10856018B2 (en) Clock synchronization techniques including modification of sample rate conversion
CN109089130B (zh) Method and device for adjusting the timestamp of a live streaming video
EP1570368B1 (fr) Content distribution system through the provision of data streams
CN103200461B (zh) Synchronized playback system and playback method for multiple playback terminals
US11812103B2 (en) Dynamic playout of transition frames while transitioning between playout of media streams
KR20210030478A (ko) Dynamic reduction of replacement-content playout to help align the end of the replacement content with the end of the replaced content
US9621682B2 (en) Reduced latency media distribution system
TW201802700A (zh) System and method for controlling synchronized data streams
US20160005439A1 (en) Systems and methods for networked media synchronization
TWI507022B (zh) Buffered output method for multimedia streams and multimedia stream buffering module
CN111147906A (zh) Synchronous playback system and synchronous playback method
CN108495239A (zh) Method, apparatus, device, and storage medium for accurately synchronized audio playback across multiple devices
CN109040819A (zh) Playback progress synchronization method, apparatus, device, and storage medium
KR20210078985A (ko) Method for synchronizing playback of digital content among multiple connected devices, and device using the same
CN104618737B (zh) Slow clock synchronization method for a streaming media system and device therefor
CN108259998A (zh) Player, playback control method and apparatus, electronic device, and playback system
WO2023015404A1 (fr) Audio playback method, apparatus, electronic device and storage medium
JP4742836B2 (ja) Receiving device
CN108696762A (zh) Synchronous playback method, apparatus, and system
WO2023273601A1 (fr) Audio synchronization method, audio playback device, audio source, and storage medium
KR102131741B1 (ko) Method for synchronizing video across multiple signage displays

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21953043

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2024507963

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE