WO2022179306A1 - Audio/video playback method and apparatus, and electronic device - Google Patents


Info

Publication number
WO2022179306A1
WO2022179306A1 (PCT/CN2021/143557, CN2021143557W)
Authority
WO
WIPO (PCT)
Prior art keywords
playback
audio
video image
data
video
Prior art date
Application number
PCT/CN2021/143557
Other languages
English (en)
Chinese (zh)
Inventor
罗诚
李刚
周凡
向宇
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2022179306A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/462 Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs

Definitions

  • the present application relates to the technical field of intelligent terminals, and in particular, to an audio and video playback method, device, and electronic device.
  • a common application scenario is the audio and video separation application scenario.
  • different collection devices are used to collect audio data and video image data respectively, and different playback devices are used to play the audio data and the video image data respectively.
  • the mobile phone cooperates with the surrounding large screen, camera, microphone, and speaker to realize the video telephony service with the remote end.
  • the audio data and video image data are respectively transmitted from the data output end to the audio playback end (eg, a smart speaker) and the video image playback end (eg, a large-screen TV) for playback.
  • separate transmission is subject to unfavorable factors such as transmission errors, unstable transmission signal strength, and transmission delay. These unfavorable factors cause freezes and delays in both audio playback and video image playback, which greatly reduces the user experience.
  • the present application provides an audio and video playback method, apparatus, and electronic device; an audio playback method, apparatus, and electronic device; a video image playback method, apparatus, and electronic device; and a computer-readable storage medium.
  • the present application provides a method for playing audio and video, including:
  • the catch-up strategy is configured as the playback strategy for audio playback and/or video image playback, wherein the catch-up strategy includes: adjusting the audio playback and/or the video image playback so that its playback progress catches up with the progress of the audio and video data output, and so that its playback delay compared to the audio and video data output is less than or equal to a preset interactive scene playback delay threshold.
  • the audio and video playback scenarios are identified, and the corresponding playback strategy is selected according to the specific application scenarios, which can greatly improve the user experience of audio and video playback.
  • using a catch-up strategy for audio and video playback in interactive application scenarios can ensure that the playback delay meets the requirements of interactive applications, thereby greatly improving the user experience in those scenarios.
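  • the catch-up decision described above can be pictured with the following illustrative sketch. The names and the default value are assumptions (not text from the application); 150 ms matches the example interactive scene threshold given later in the description.

```python
# Illustrative sketch of the catch-up decision. Names and the 150 ms
# default are assumptions, not claim language from the application.
INTERACTIVE_DELAY_THRESHOLD_MS = 150

def playback_delay_ms(output_progress_ms: int, playback_progress_ms: int) -> int:
    """Delay of the playback end relative to the audio/video data output."""
    return output_progress_ms - playback_progress_ms

def needs_catch_up(output_progress_ms: int, playback_progress_ms: int,
                   threshold_ms: int = INTERACTIVE_DELAY_THRESHOLD_MS) -> bool:
    """True when playback lags the data output by more than the threshold."""
    return playback_delay_ms(output_progress_ms, playback_progress_ms) > threshold_ms
```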
  • the method further includes: when the current application scenario is a non-interactive application scenario, a smooth playback strategy is configured as the playback strategy for the audio playback and/or the video image playback.
  • the smooth playback strategy is used for audio and video playback in non-interactive application scenarios, which can ensure that the smoothness of playback meets the requirements of non-interactive applications, thereby greatly improving the user experience in those scenarios.
  • configuring the catch-up strategy as the playback strategy for audio playback and/or video image playback includes: configuring the catch-up strategy as the playback strategy for both the audio playback and the video image playback.
  • configuring the catch-up strategy as the playback strategy for audio playback and/or video image playback includes: configuring a synchronization strategy as the playback strategy for the video image playback, wherein the synchronization strategy includes: adjusting the video image playback based on the playback progress of the audio playback, so that the playback progress of the video image playback is synchronized with that of the audio playback.
  • configuring the catch-up strategy as the playback strategy for audio playback and/or video image playback includes: configuring a synchronization strategy as the playback strategy for the audio playback, wherein the synchronization strategy includes: adjusting the audio playback based on the playback progress of the video image playback, so that the playback progress of the audio playback is synchronized with that of the video image playback.
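  • one hedged way to picture the synchronization strategy (audio adjusted to follow video, as in the variant above) is sketched below. The function name and the 40 ms tolerance are illustrative assumptions, not values from the application.

```python
def sync_audio_to_video(audio_progress_ms: int, video_progress_ms: int,
                        tolerance_ms: int = 40):
    """Decide how to adjust audio so its progress tracks the video progress.
    Returns ('skip', n) to jump audio forward by n ms, ('pause', n) to hold
    audio for n ms, or ('play', 0) when the streams are within tolerance."""
    drift = video_progress_ms - audio_progress_ms
    if drift > tolerance_ms:
        return ('skip', drift)    # audio lags the video image playback
    if drift < -tolerance_ms:
        return ('pause', -drift)  # audio runs ahead of the video image
    return ('play', 0)
```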
  • in a second aspect, the present application provides an audio playback method, comprising:
  • the audio playback is performed based on the interactive scene playback delay threshold, and during the audio playback, the audio playback is adjusted so that its playback progress catches up with the output progress of the audio and video data, and so that the playback delay of the audio playback compared to the audio and video data output is less than or equal to the preset interactive scene playback delay threshold.
  • performing the audio playback based on the interactive scene playback delay threshold includes: adjusting the audio playback so that the playback delay of the audio playback is less than or equal to the interactive scene playback delay threshold.
  • performing the audio playback based on the interactive scene playback delay threshold includes: when the data buffer amount of the unplayed data of the audio playback exceeds a preset data buffer threshold, adjusting the data buffer amount of the unplayed data of the audio playback so that it is less than or equal to the preset data buffer threshold.
  • the adjusting of the audio playback includes: deleting all or part of the unplayed data in the audio buffer, so that the audio playback skips the deleted unplayed data and the playback progress of the audio playback catches up with the output progress of the audio and video data.
  • the deletion of all or part of the unplayed data in the audio buffer includes:
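  • a minimal sketch of this deletion step, assuming the buffer holds decodable audio frames in arrival order and that deletion removes the oldest unplayed frames first (the exact deletion rule is not spelled out in this excerpt):

```python
from collections import deque

def trim_audio_buffer(buffer: deque, max_frames: int) -> int:
    """Delete the oldest unplayed audio frames until at most max_frames
    remain; playback then resumes from newer data, so its progress catches
    up with the audio/video data output. Returns how many frames were
    skipped."""
    skipped = 0
    while len(buffer) > max_frames:
        buffer.popleft()
        skipped += 1
    return skipped
```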
  • in a third aspect, the present application provides a video image playback method, comprising: performing the video image playback based on the interactive scene playback delay threshold, and during the video image playback, adjusting the video image playback so that its playback progress catches up with the output progress of the audio and video data, and so that the playback delay of the video image playback compared to the audio and video data output is less than or equal to a preset interactive scene playback delay threshold.
  • performing the video image playback based on the interactive scene playback delay threshold includes: adjusting the video image playback so that the playback delay of the video image playback is less than or equal to the interactive scene playback delay threshold.
  • performing the video image playback based on the interactive scene playback delay threshold includes: when the data buffer amount of the unplayed data of the video image playback exceeds a preset data buffer threshold, adjusting it so that it is less than or equal to the preset data buffer threshold.
  • the adjusting of the video image playback includes: deleting all or part of the unplayed data in the video image buffer, so that the video image playback skips the deleted unplayed data and the playback progress of the video image playback catches up with the output progress of the audio and video data.
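  • for video, unplayed data cannot be deleted at arbitrary points, because inter-coded frames depend on earlier frames. The sketch below is an assumption about one reasonable realization, not claim language: drop everything before the newest buffered keyframe so the decoder can resume cleanly.

```python
def drop_to_latest_keyframe(buffer: list) -> int:
    """Delete buffered unplayed frames preceding the most recent keyframe.
    Each entry is a (is_keyframe, payload) tuple. Returns the number of
    frames the video image playback will skip."""
    last_key = None
    for i, (is_key, _) in enumerate(buffer):
        if is_key:
            last_key = i
    if not last_key:          # no keyframe found, or keyframe already first
        return 0
    del buffer[:last_key]     # skip everything before the newest keyframe
    return last_key
```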
  • the present application provides an audio and video playback device, the device comprising:
  • a scene identification module which is used to identify the current application scene
  • a playback strategy configuration module, configured to configure the catch-up strategy as the playback strategy for audio playback and/or video image playback when the current application scenario is an interactive application scenario, wherein the catch-up strategy includes: adjusting the audio playback and/or the video image playback so that its playback progress catches up with the progress of the audio and video data output, and so that its playback delay compared to the audio and video data output is less than or equal to the preset interactive scene playback delay threshold.
  • the present application provides an audio playback device, the device comprising:
  • a threshold acquisition module which is used to acquire a preset interactive scene playback delay threshold when the playback strategy of audio playback is a catch-up strategy
  • a playback adjustment module, which is used to adjust the audio playback during the audio playback, so that the playback progress of the audio playback catches up with the progress of the audio and video data output, and so that the playback delay of the audio playback compared to the audio and video data output is less than or equal to the preset interactive scene playback delay threshold.
  • the present application provides a video image playback device, the device comprising:
  • a threshold acquisition module which is used to acquire a preset interactive scene playback delay threshold when the playback strategy of video image playback is a catch-up strategy
  • a playback adjustment module, which is used to adjust the video image playback during the video image playback, so that the playback progress of the video image playback catches up with the progress of the audio and video data output, and so that the playback delay of the video image playback compared to the audio and video data output is less than or equal to the preset interactive scene playback delay threshold.
  • the present application provides an audio and video playback device, the device comprising:
  • a scene identification module which is used to identify the current application scene
  • a playback strategy configuration module, which is used to configure a playback strategy for audio playback and/or video image playback according to the current application scenario, including: when the current application scenario is an interactive application scenario, configuring a catch-up strategy as the playback strategy for the audio playback and/or the video image playback, wherein the catch-up strategy includes: adjusting the audio playback and/or the video image playback so that its playback progress catches up with the progress of the audio and video data output, and so that its playback delay compared to the audio and video data output is less than or equal to a preset interactive scene playback delay threshold;
  • a threshold acquisition module configured to acquire a preset interactive scene playback delay threshold when the playback strategy of the audio playback and/or the video image playback is a catch-up strategy
  • a playback adjustment module, which is used to adjust the audio playback and/or the video image playback during playback, so that the playback progress of the audio playback and/or the video image playback catches up with the progress of the audio and video data output, and so that the playback delay compared to the audio and video data output is less than or equal to a preset interactive scene playback delay threshold.
  • the application provides an audio and video playback system, the system comprising:
  • an audio and video output device, which is used to output audio data and video image data respectively, and to identify the current application scenario; when the current application scenario is an interactive application scenario, the catch-up strategy is configured as the playback strategy for audio playback and/or video image playback, wherein the catch-up strategy includes: adjusting the audio playback and/or the video image playback so that its playback progress catches up with the progress of the audio and video data output, and so that its playback delay compared to the audio and video data output is less than or equal to a preset interactive scene playback delay threshold;
  • An audio playback device configured to receive the audio data, and play the audio data according to a playback strategy of the audio playback configured by the audio and video output device;
  • a video image playback device configured to receive the video image data, and play the video image data according to a playback strategy of the video image playback configured by the audio and video output device.
  • the application provides an audio and video playback system, the system comprising:
  • an audio output device for outputting audio data
  • a video image output device for outputting video image data
  • Audio and video playback device which is used for:
  • identifying the current application scenario; when the current application scenario is an interactive application scenario, configuring a catch-up strategy as the playback strategy for audio playback and/or video image playback, wherein the catch-up strategy includes: adjusting the audio playback and/or the video image playback so that its playback progress catches up with the progress of the audio and video data output, and so that its playback delay compared to the audio and video data output is less than or equal to the preset interactive scene playback delay threshold;
  • the audio data and the video image data are played according to the playback strategy of the audio playback and the playback strategy of the video image playback.
  • the present application provides an electronic device, the electronic device comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the electronic device is triggered to perform the method steps described in the first aspect above.
  • the present application provides an electronic device comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the electronic device is triggered to perform the method steps described in the second aspect above.
  • the present application provides an electronic device comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the electronic device is triggered to perform the method steps described in the third aspect above.
  • the present application provides a computer-readable storage medium, in which a computer program is stored; when the computer program runs on a computer, the computer is caused to execute the methods of the first to third aspects above.
  • FIG. 1 is a schematic diagram of an application scenario in which audio and video data are transmitted separately;
  • FIG. 2 is a schematic diagram of an application scenario in which audio and video data are transmitted separately;
  • FIG. 3 is a flowchart of playing audio and video data separately;
  • FIG. 4 is a schematic diagram of an application scenario in which audio and video data are transmitted separately;
  • FIG. 5 is a flowchart of an audio and video playback method according to an embodiment of the present application;
  • FIG. 6 is a structural block diagram of an audio and video playback system according to an embodiment of the present application;
  • FIG. 7 is a structural block diagram of an audio and video playback system according to an embodiment of the present application;
  • FIG. 8 is a schematic diagram of an application scenario in which audio and video data are transmitted separately;
  • FIG. 9 is a flowchart of an audio playback method according to an embodiment of the present application;
  • FIG. 10 is a flowchart of an audio playback method according to an embodiment of the present application;
  • FIG. 11 is a schematic diagram of the AAC audio data decoding process;
  • FIG. 12 is a flowchart of a video image playback method according to an embodiment of the present application;
  • FIG. 13 is a flowchart of a video image playback method according to an embodiment of the present application;
  • FIG. 14 is a flowchart of an audio and video playback method according to an embodiment of the present application;
  • FIG. 15 is a flowchart of an audio and video playback method according to an embodiment of the present application;
  • FIG. 16 is a partial flowchart of an audio and video playback method according to an embodiment of the present application.
  • FIG. 1 shows a schematic diagram of an application scenario of transmitting audio and video data respectively.
  • the drone 11 collects video images of the residential community and transmits the collected video image data to the mobile phone 13; the microphone 12 collects audio for the community (for example, the ambient sound of the area being shot by the drone 11, or a tour guide's on-site commentary on that area) and transmits the collected audio data to the mobile phone 13.
  • the mobile phone 13 plays the audio data collected by the microphone 12 while playing the video image data collected by the drone 11, so that the user of the mobile phone 13 can take a virtual tour of the community.
  • the drone 11 continuously collects video image data (for example, A11, A12, A13, ...) and sends it to the mobile phone 13 in sequence; the microphone 12 continuously collects audio data (for example, B11, B12, B13, ...) synchronized with the video image data and sends it to the mobile phone 13 in sequence.
  • the mobile phone 13 plays video image data and audio data in the order in which the data is received.
  • if the video image data transmission delay is stable, the mobile phone 13 will continuously receive the video image data as it is sent, and thus perform continuous video image playback; however, if the video image data transmission delay is unstable, the mobile phone 13 cannot continuously receive the video image data, and the video image playback of the mobile phone 13 may freeze or skip frames.
  • similarly, if the audio data transmission delay is stable, the mobile phone 13 will continuously receive the audio data as it is sent, and thus perform continuous audio playback; however, if the audio data transmission delay is unstable, the mobile phone 13 cannot continuously receive the audio data, and stuttering, skipping, and the like may occur during audio playback.
  • FIG. 2 shows a schematic diagram of an application scenario of transmitting audio and video data respectively.
  • the drone 21 collects video images of the residential community and transmits the collected video image data to the mobile phone 23; the microphone 22 collects audio for the community (for example, the ambient sound of the area being shot by the drone 21, or a tour guide's on-site commentary) and transmits the collected audio data to the mobile phone 23.
  • the mobile phone 23 transmits video image data to the large-screen TV 25 for playback; at the same time, the mobile phone 23 also transmits audio data to the smart speaker 24 for playback. In this way, the user can take a virtual tour of the community through the cooperation of the large-screen TV 25 and the smart speaker 24 .
  • the mobile phone 23 continuously transmits video image data (eg, A21, A22, A23 . . . ) to the large-screen TV 25 in sequence.
  • the mobile phone 23 continuously sends the audio data (eg, B21, B22, B23 . . . ) synchronized with the video image data to the smart speaker 24 in sequence.
  • the large-screen TV 25 plays the video image data in the order in which it is received. During this process, if the transmission delay of the video image data is stable, the large-screen TV 25 will continuously receive the video image data as the mobile phone 23 sends it, and thus perform continuous video image playback; however, if the video image data transmission delay is unstable, the large-screen TV 25 cannot continuously receive the video image data, and its video image playback may freeze or skip frames.
  • the smart speaker 24 plays the audio data in the order in which it is received. If the audio data transmission delay is stable, the smart speaker 24 will continuously receive the audio data as the mobile phone 23 sends it, and thus perform continuous audio playback; however, if the audio data transmission delay is unstable, the smart speaker 24 cannot continuously receive the audio data, and its audio playback will stutter, skip, and the like.
  • an embodiment of the present application provides a cache-based method for playing audio and video. Specifically, after receiving the audio data and video image data, the receiving device does not play them immediately, but temporarily buffers them, and only begins playing after a certain amount of data has been buffered.
  • FIG. 3 shows a flowchart of playing audio data and video image data separately. As shown in FIG. 3, in the process of playing video image data:
  • Step 311: the data output end outputs video image data;
  • Step 312: the playback end receives the video image data;
  • Step 314: the playback end plays the video image data in the buffer according to the order in which it was buffered.
  • In the process of playing audio data:
  • Step 321: the data output end outputs audio data;
  • Step 322: the playback end receives the audio data;
  • Step 324: the playback end plays the audio data in the buffer according to the order in which it was buffered.
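  • the buffered playback flow of steps 311-324 can be sketched as follows; the class name and the start threshold are illustrative assumptions, not part of the application.

```python
from collections import deque

class BufferedPlayer:
    """Sketch of the cache-based flow: receive data (steps 312/322), buffer
    it, and begin playing in buffer order (steps 314/324) only once a start
    threshold of buffered items has accumulated."""
    def __init__(self, start_threshold: int):
        self.buffer = deque()
        self.start_threshold = start_threshold
        self.started = False

    def receive(self, chunk) -> None:
        self.buffer.append(chunk)
        if len(self.buffer) >= self.start_threshold:
            self.started = True   # enough data buffered to absorb jitter

    def play_next(self):
        """Return the next buffered item, or None before start / on underrun."""
        if not self.started or not self.buffer:
            return None
        return self.buffer.popleft()
```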
  • the data output end for audio playback and the data output end for video image playback may be the same device or different devices; likewise, the playback end for audio playback and the playback end for video image playback may be the same device or different devices.
  • the data buffering amount can be changed dynamically according to fluctuations in the data transmission delay; alternatively, a maximum data buffering amount can be set according to the maximum value of the delay fluctuation.
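  • for instance, sizing the buffer from observed delay fluctuation might look like the sketch below, under the assumption that the buffer only needs to cover the worst observed jitter; the names and the 20 ms frame duration are illustrative.

```python
def buffer_target_ms(delay_samples_ms: list, frame_duration_ms: int = 20) -> int:
    """Pick a buffer depth (in ms) that covers the worst observed
    transmission-delay fluctuation, rounded up to whole frames."""
    jitter = max(delay_samples_ms) - min(delay_samples_ms)
    frames = -(-jitter // frame_duration_ms)   # ceiling division
    return frames * frame_duration_ms
```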
  • when the playback end plays the video image data or audio data in the buffer according to the buffering order, the time at which the playback end receives a given piece of video image data or audio data and the time at which it plays that piece of data may be different.
  • the larger the amount of buffered audio or video image data, the less interference fluctuations in transmission delay cause during subsequent playback. For example, if all the audio data and video image data of a certain video are cached before playback begins, then no freezing or frame skipping caused by delay fluctuations will occur during playback.
  • the smart speaker 24 performs audio playback according to the arrangement order in the cache queue (in the order of B21, B22, B23); the larger the amount of data buffered before a piece of data is played, the greater its playback delay.
  • the increase in playback delay will greatly reduce the user experience.
  • for example, suppose the user of the mobile phone 13 controls the drone according to the drone-captured image displayed on the mobile phone 13. If the time difference between the drone 11 outputting a video image and the mobile phone 13 receiving and playing that image is too large, then the flight attitude the user infers from the displayed image deviates from the real-time flight attitude of the drone 11, and the user cannot control the flight attitude smoothly.
  • similarly, suppose the user of the mobile phone 23 controls the drone according to the drone-captured image displayed on the large-screen TV 25. Even ignoring the delay between the drone 21 outputting a video image and the mobile phone 23 receiving it, if the time difference between the mobile phone 23 outputting the video image and the large-screen TV 25 receiving and displaying it is too large, then the flight attitude the user infers from the image displayed on the large-screen TV 25 deviates from the real-time flight attitude of the drone 21, and the user cannot control the flight attitude smoothly.
  • FIG. 4 shows a schematic diagram of an application scenario of transmitting audio and video data respectively.
  • user A and user B use their respective mobile phones (mobile phone 43 and mobile phone 44 ) to implement a video call with each other.
  • user B sends the video image of the video call to the large-screen TV 41 for display through the mobile phone 43, and sends the voice of the video call to the smart speaker 42 for playback.
  • if the playback delay between the video image displayed on the large-screen TV 41 and the voice played on the smart speaker 42 increases, the interaction delay between user A and user B also increases, which greatly degrades the user experience of their video call.
  • to this end, an embodiment of the present application proposes an audio and video playback method based on a catch-up strategy, which identifies the audio and video playback scenario and selects the corresponding playback strategy according to the specific application scenario.
  • specifically, an interactive scene playback delay threshold is preset. The interactive scene playback delay threshold is the maximum playback delay, between the output device outputting the audio data and/or video image data and the playback end playing it, that still satisfies the expected user interaction experience.
  • for example, the interactive scene playback delay thresholds between the mobile phone 43 and the large-screen TV 41, and between the mobile phone 43 and the smart speaker 42, are both set to 150 ms.
  • when the application scenario is an interactive application scenario, audio playback and/or video image playback is performed based on the catch-up strategy: the preset interactive scene playback delay threshold is invoked, and during audio playback and/or video image playback, the playback is adjusted so that its playback delay is less than or equal to the interactive scene playback delay threshold.
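  • besides deleting buffered data, another common realization of "adjusting playback so the delay falls back under the threshold" is briefly speeding playback up. This is an assumption about one possible implementation, not something this excerpt specifies; the names and rate are illustrative.

```python
def playback_rate(delay_ms: int, threshold_ms: int = 150,
                  catch_up_rate: float = 1.25) -> float:
    """Play slightly faster while the playback delay exceeds the
    interactive scene threshold, and at normal speed otherwise."""
    return catch_up_rate if delay_ms > threshold_ms else 1.0
```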
  • FIG. 5 is a flowchart of a method for playing audio and video according to an embodiment of the present application.
  • the audio and video playback method is executed by one or more devices in the audio and video playback system (including the audio and video data output device, the audio playback end, and the video image playback end). As shown in FIG. 5:
  • Step 500: identify the current application scenario;
  • Step 501: judge whether the current application scenario is an interactive application scenario;
  • if the current application scenario is an interactive application scenario, step 502 is performed;
  • Step 502: invoke the catch-up strategy;
  • Step 520 using a catch-up strategy to play audio and/or video images, including:
  • Step 521 calling a preset interactive scene playback delay threshold
  • Step 522 adjust audio playback and/or video image playback, so that the playback progress of audio playback and/or video image playback catches up with the progress of audio and video data output, so that audio playback and/or video image playback is compared to audio and video data output.
  • the playback delay is less than or equal to the interactive scene playback delay threshold.
• By identifying the audio and video playback scenario and selecting the corresponding playback strategy for the specific application scenario, the user experience of audio and video playback can be greatly improved.
• Using a catch-up strategy for audio and video playback in interactive application scenarios ensures that the playback delay meets the application requirements of interactive application scenarios, thereby greatly improving the user experience of those scenarios.
  • FIG. 6 is a structural block diagram of an audio and video playback system according to an embodiment of the present application.
• An audio output device 601 (e.g., a sound capture device) and a video image output device 602 (e.g., an image capture device) output audio data and video image data to a playback device 603 (e.g., a mobile phone), and the playback device 603 plays the received audio data and video image data synchronously.
• The user views and listens to the video through the playback device 603.
• Steps 500, 501 and 502 can be performed by the playback device 603 (for example, a mobile phone).
• When the playback device 603 determines that the current application scene is an interactive application scene, it invokes the catch-up strategy (calling the preset interactive scene playback delay threshold) and performs playback based on the catch-up strategy.
• The mobile phone 13 determines that the current application scenario is an interactive application scenario (real-time control of the flight attitude of the drone 11).
• The mobile phone 13 invokes the preset interactive scene playback delay threshold, implements a catch-up strategy based on it, and plays the audio data from the microphone 12 and the video image data from the drone 11 based on the catch-up strategy.
  • FIG. 7 is a structural block diagram of an audio and video playback system according to an embodiment of the present application.
• An audio and video output device 701 (e.g., a mobile phone) synchronously outputs the audio data and video image data of the same video (e.g., the video of a video call).
• The audio data is output to the audio playback device 702 (e.g., a smart speaker), and the video image data is output to the video image playback device 703 (e.g., a large-screen display device).
• The audio playback device 702 plays the received audio data, and the video image playback device 703 plays the received video image data.
• The user views and listens to the video by watching the video image playback device 703 and listening to the audio playback device 702.
• Steps 500, 501 and 502 can be performed by the audio and video output device 701 (e.g., a mobile phone).
• When the audio and video output device 701 determines that the current application scene is an interactive application scene, it invokes the catch-up strategy (including the interactive scene playback delay threshold) and sends it to the audio playback device 702 and/or the video image playback device 703; the audio playback device 702 and/or the video image playback device 703 then implements the catch-up strategy based on the acquired interactive scene playback delay threshold.
• The mobile phone 23 determines that the current application scenario is an interactive application scenario (real-time control of the flight attitude of the drone 21).
• The mobile phone 23 invokes the preset interactive scene playback delay threshold and sends it to the large-screen TV 25; the large-screen TV 25 implements a catch-up strategy based on the acquired threshold and plays the video image data from the mobile phone 23 based on the catch-up strategy.
• The mobile phone 43 determines that the current application scenario is an interactive application scenario (making a video call with the mobile phone 44).
• The mobile phone 43 calls the preset interactive scene playback delay threshold and sends it to the large-screen TV 41 and the smart speaker 42;
• the large-screen TV 41 implements a catch-up strategy based on the acquired interactive scene playback delay threshold and plays the video image data from the mobile phone 43 based on the catch-up strategy;
• the smart speaker 42 implements a catch-up strategy based on the acquired interactive scene playback delay threshold and plays the audio data from the mobile phone 43 based on the catch-up strategy.
• The user only uses the mobile phone 13 to perform a virtual tour of the community along the designated route and does not control the drone 11. Therefore, the playback delay between the audio output by the microphone 12 and its playback on the mobile phone 13, and the playback delay between the video image output by the drone 11 and its playback on the mobile phone 13, will not significantly affect the user experience of the virtual tour.
  • FIG. 8 is a schematic diagram of an application scenario of transmitting audio and video data respectively.
• The mobile phone 63 is connected to the large-screen TV 61 and the smart speaker 62. The user uses the mobile phone 63 to watch a video (for example, in the embodiment shown in FIG. 1, a video generated by integrating the video image collected by the drone 11 and the audio collected by the microphone 12).
• When playing the video, the mobile phone 63 does not output the audio and video images through its own speaker and screen; instead, it sends the video image of the played video to the large-screen TV 61 for display, and sends the audio to the smart speaker 62 for playback.
• The playback delay between the video image and audio output by the mobile phone 63 and the video image displayed by the large-screen TV 61 and the audio played by the smart speaker 62 will not significantly affect the user experience of video viewing.
  • an embodiment of the present application proposes an audio and video playback method based on a smooth playback strategy.
• The data cache value is set according to the data transmission delay fluctuation of the application scenario. Specifically, the data cache value matches the maximum value of the data transmission delay fluctuation, or is adjusted in real time according to the fluctuation.
• When the application scenario is a non-interactive scenario, audio playback and/or video image playback is performed based on the smooth playback strategy.
• In the smooth playback strategy, the audio data and video image data are cached and played based on the data cache value, ensuring that the audio and video data can be played smoothly when the data transmission delay fluctuates.
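The two ways of setting the data cache value described above can be sketched as follows. This is an illustrative sketch only; the function names and the use of millisecond delay samples are assumptions.

```python
def data_cache_ms(transit_delays_ms):
    """Cache value matched to the maximum transmission-delay fluctuation:
    the spread between the largest and smallest observed delays."""
    return max(transit_delays_ms) - min(transit_delays_ms)

def refresh_cache_ms(current_cache_ms, observed_fluctuation_ms):
    """Real-time adjustment: grow the cache when a larger fluctuation
    is observed, otherwise keep the current value."""
    return max(current_cache_ms, observed_fluctuation_ms)
```

For instance, transmission delays observed between 40 ms and 70 ms give a fluctuation of 30 ms, so the cache absorbs the jitter without stalling playback.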
• When the current application scenario is not an interactive application scenario (a non-interactive scenario), step 503 is performed;
• Step 503: call the smooth playback strategy;
• Step 510: use the smooth playback strategy to play audio and/or video images.
• Step 503 may be performed by a playback device (e.g., a mobile phone).
• Using the smooth playback strategy to play audio and video in non-interactive application scenarios ensures that the smoothness of playback meets the application requirements of those scenarios, thereby greatly improving the user experience of non-interactive application scenarios.
• The mobile phone 13 determines that the current application scenario is a non-interactive application scenario (simple video viewing, without controlling the flight attitude of the drone 11).
• The mobile phone 13 invokes the smooth playback strategy and plays the audio data from the microphone 12 and the video image data from the drone 11 based on the smooth playback strategy.
• Step 503 can be performed by an audio and video output device (for example, a mobile phone).
• When the audio and video output device determines that the current application scenario is an interactive application scenario, it invokes the interactive scene playback delay threshold and delivers it to the audio playback end and/or the video image playback end; the audio playback end and/or the video image playback end then implements a catch-up strategy based on the acquired threshold.
• The mobile phone 63 determines that the current application scenario is a non-interactive application scenario (a simple video playback scenario).
• The mobile phone 63 invokes the smooth playback strategy and delivers it to the large-screen TV 61 and the smart speaker 62;
• the large-screen TV 61 plays the video image data from the mobile phone 63 based on the smooth playback strategy;
• the smart speaker 62 plays the audio data from the mobile phone 63 based on the smooth playback strategy.
• The smooth playback strategy can be implemented in various manners.
• In one implementation, the data cache value is set according to the fluctuation of the data transmission delay in the application scenario.
• In another implementation, the amount of buffered data is monitored, and when the current amount of buffered data reaches the data cache value, the cache value is renegotiated and increased.
• For example, the video playback device reports its buffer status to the central device (the audio and video output device); the central device monitors the change, renegotiates, and issues a new data cache value to the video playback device and the audio playback device.
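The renegotiation between the playback device and the central device can be sketched as follows. The class and method names, the fixed increment step, and the millisecond units are assumptions for illustration; the embodiment does not specify how the new cache value is chosen.

```python
class CacheNegotiator:
    """Central-device (audio and video output device) side: when a playback
    device reports that its buffered amount has reached the current data
    cache value, renegotiate and issue a larger value."""

    def __init__(self, cache_value, step):
        self.cache_value = cache_value   # current data cache value (ms)
        self.step = step                 # increment per renegotiation (ms)

    def report(self, buffered):
        """Playback device reports its buffered amount; returns the
        (possibly renegotiated) cache value to apply."""
        if buffered >= self.cache_value:
            self.cache_value += self.step   # issue a new, larger cache value
        return self.cache_value
```

A playback device that fills its 100 ms cache would, under this sketch, be told to buffer 150 ms from then on.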
• Based on this, an embodiment of the present application proposes an audio playback method, including: when the playback strategy of audio playback is the catch-up strategy, obtaining a preset interactive scene playback delay threshold; and, during audio playback, adjusting the playback so that its progress catches up with the progress of the audio and video data output, such that the playback delay of the audio playback relative to the audio and video data output is less than or equal to the preset interactive scene playback delay threshold.
• In an implementation of the catch-up strategy, the playback delay of the audio playback relative to the audio and video data output is directly monitored; when this playback delay exceeds the interactive scene playback delay threshold, the audio playback is adjusted so that the playback progress of the audio playback end, which plays based on the audio cache data, catches up with the audio and video data output progress, reducing the playback delay until it is less than or equal to the interactive scene playback delay threshold.
  • FIG. 9 is a flowchart of an audio playback method according to an embodiment of the present application.
• The audio playback end executes the following process, as shown in Figure 9:
• Step 910: monitor the playback delay of the audio playback relative to the audio and video data output;
• Step 920: judge whether the playback delay of the audio playback exceeds the preset interactive scene playback delay threshold;
• when the playback delay of the audio playback exceeds the preset interactive scene playback delay threshold, perform step 930;
• Step 930: adjust the audio playback so that the playback progress of the audio playback end, which plays based on the audio buffer data, catches up with the audio and video data output progress, reducing the playback delay of the audio playback until it is less than or equal to the interactive scene playback delay threshold.
• Step 910 may be implemented by adopting various solutions.
• For example, when outputting audio data, the output device adds a time stamp to the audio data; when playing the audio data, the playback end compares the time stamp of the audio data being played with the current moment to calculate the playback delay.
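The time-stamp comparison can be sketched in two small functions. The names, the millisecond units, and the 150 ms default are illustrative assumptions (150 ms matches the example threshold used elsewhere in this description).

```python
def playback_delay_ms(frame_timestamp_ms, now_ms):
    """Delay = current moment minus the time stamp the output device
    attached to the frame now being played."""
    return now_ms - frame_timestamp_ms

def exceeds_threshold(frame_timestamp_ms, now_ms, threshold_ms=150):
    """Step 920-style check against the interactive scene playback
    delay threshold."""
    return playback_delay_ms(frame_timestamp_ms, now_ms) > threshold_ms
```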
• The audio playback end plays the audio buffer data in its storage order in the audio buffer, and audio data is moved out of the buffer after being played. The amount of data buffered in the audio buffer therefore represents the difference between the progress of the data output end in outputting audio data and the progress of the audio playback end in playing it. Accordingly, to simplify the operation process and reduce the difficulty of implementing the solution, one implementation of the catch-up strategy monitors the buffered amount of audio data.
• Specifically, a corresponding buffered data volume threshold is determined from the preset interactive scene playback delay threshold; when the current amount of buffered audio data exceeds this threshold, it can be determined that the audio playback delay exceeds the interactive scene playback delay threshold. For example, assuming the interactive scene playback delay threshold is 150 ms, the buffered data volume threshold corresponding to the audio can be calculated from the audio playback speed to be 4 frames.
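The conversion from delay threshold to frame count can be sketched as below. The 37.5 ms per-frame duration is an assumption chosen only so that the 150 ms example yields the 4-frame figure; the actual frame duration depends on the codec frame size and sample rate.

```python
def buffered_frames_threshold(threshold_ms, frame_duration_ms):
    """Largest whole number of buffered frames whose total playback time
    stays within the interactive scene playback delay threshold."""
    return int(threshold_ms // frame_duration_ms)
```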
  • FIG. 10 is a flowchart of an audio playback method according to an embodiment of the present application.
• The audio playback end executes the following process, as shown in Figure 10:
• Step 1010: monitor the data buffer amount of unplayed audio data;
• Step 1020: judge whether the audio data buffer amount exceeds a preset audio data buffer threshold, where the buffer threshold is a value determined according to the interactive scene playback delay threshold;
• when the audio data buffer amount exceeds the preset audio data buffer threshold, perform step 1030;
• Step 1030: adjust the buffer amount of the unplayed audio data so that it is less than or equal to the preset audio data buffer threshold.
• In an implementation of step 522, when the audio playback end plays based on the audio buffer data, it does not read and play the audio data sequentially in the storage order of the buffer, but reads and plays it in a skip mode.
• In another implementation of step 522, when the playback progress of the audio playback end needs to catch up with the progress of the audio and video data output, all or part of the unplayed audio data in the buffer of the audio playback end is deleted, so that playback skips the deleted data and the audio playback progress of the audio playback end catches up with the output progress of the audio and video data.
• The deletion amount of the audio data corresponds to the playback progress of the current audio playback, so that the playback progress of the audio playback end satisfies the interactive scene playback delay threshold.
• For example, the deletion amount of unplayed audio data corresponds to the amount by which the playback delay exceeds the interactive scene playback delay threshold.
• In an implementation of step 1030, data in the audio buffer is deleted so that the audio data buffer amount is less than or equal to the audio buffer data volume threshold.
• In implementations of step 522, those skilled in the art may adopt various schemes to delete the unplayed audio data in the cache: for example, delete the earliest-stored unplayed audio data in the audio buffer; or randomly select the data to be deleted from the unplayed audio data in the audio buffer; or select the data to be deleted from the unplayed audio data at fixed intervals.
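The three example deletion schemes can be sketched as one selection function. The function and policy names are illustrative assumptions, not terms from the embodiment.

```python
import random

def select_for_deletion(n_buffered, n_delete, policy="oldest"):
    """Return sorted indices of unplayed frames to delete under the three
    example policies: earliest-stored, random, or fixed-interval."""
    indices = list(range(n_buffered))
    if policy == "oldest":            # earliest-stored data first
        chosen = indices[:n_delete]
    elif policy == "random":          # random selection among unplayed data
        chosen = random.sample(indices, n_delete)
    else:                             # "interval": evenly spaced frames
        step = max(1, n_buffered // n_delete)
        chosen = indices[::step][:n_delete]
    return sorted(chosen)
```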
• In an implementation of step 522, the waveform and frequency of the audio data in the audio cache are monitored, and when audio data in the cache needs to be deleted, audio data to which the human ear is not sensitive is deleted preferentially.
• The sound frequency range a normal person can perceive is 20-20000 Hz, but the human ear is most sensitive to sound in the frequency range of 1000-3000 Hz.
• Therefore, audio frames are selectively dropped according to the sensitivity priority of the human ear.
• In this embodiment, the format of the transmitted audio frame is generally Advanced Audio Coding (AAC).
• The frequency domain information of the frame data is extracted (usually through a fast Fourier transform), and the sensitivity of this frame of audio to the human ear is analyzed (for example, 3000 Hz is the most sensitive, with the sensitivity represented as 100; 20000 Hz is the least sensitive, with the sensitivity represented as 0).
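A minimal sketch of this frequency-domain sensitivity analysis follows. It uses a naive DFT in place of the FFT the text mentions, scores only the dominant frequency, and interpolates the 0-100 scale linearly between the stated anchor points; all of these simplifications, and the function names, are assumptions for illustration.

```python
import cmath
import math

def dominant_frequency(pcm, sample_rate):
    """Naive DFT over one PCM frame: frequency of the strongest bin."""
    n = len(pcm)
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2 + 1):
        s = sum(pcm[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
        if abs(s) > best_mag:
            best_k, best_mag = k, abs(s)
    return best_k * sample_rate / n

def sensitivity(pcm, sample_rate):
    """Map the dominant frequency onto the illustrative 0-100 scale:
    1000-3000 Hz scores 100, falling off linearly towards 20 Hz below
    and 20 kHz above the band."""
    f = dominant_frequency(pcm, sample_rate)
    if 1000 <= f <= 3000:
        return 100.0
    if f < 1000:
        return max(0.0, 100.0 * (f - 20) / (1000 - 20))
    return max(0.0, 100.0 * (20000 - f) / (20000 - 3000))
```

A pure 2000 Hz tone, squarely inside the most-sensitive band, scores 100 under this sketch.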
• During decoding, the main control module puts a part of the AAC bit stream (AAC stream) into the input buffer and finds the start of a frame by searching for the synchronization word.
• Decoding then starts according to the syntax described in ISO/IEC 13818-7.
• The main task of the main control module is to operate the input and output buffers and call the other modules to work together.
• The input and output buffers are provided with interfaces by a digital signal processing (DSP) control module.
• The data stored in the output buffer is the decoded pulse code modulation (PCM) data, which represents the amplitude of the sound; the output buffer consists of fixed-length buffers.
• When the output buffer is ready, interrupt processing is invoked to output the data to the audio digital-to-analog converter (DAC) connected to the I2S interface.
  • FIG. 11 is a schematic diagram showing the decoding process of AAC audio data. As shown in Figure 11, AAC Stream goes through:
• Step 1101: noiseless decoding (Noiseless Decoding); the noiseless coding is Huffman coding, whose function is to further reduce the redundancy of the scale factors and the quantized spectrum, i.e., the Huffman-coded scale factor and quantized spectrum information are decoded;
• Step 1102: dequantization (Dequantize);
• Step 1103: joint stereo (Joint Stereo), a rendering step performed on the original samples to make the sound more pleasant;
• Step 1104: perceptual noise substitution (PNS);
• Step 1105: temporal noise shaping (TNS);
• Step 1106: inverse modified discrete cosine transform (IMDCT);
• Step 1107: spectral band replication (SBR);
• finally, the PCM code stream of the left and right channels is obtained, which the main control module puts into the output buffer and outputs to the sound playback device.
• When frames need to be dropped, the sensitivity distribution of the audio frames in the current audio buffer queue is calculated first, and then it is decided which frames to drop. For example, if frames with a sensitivity level below 60 account for 50% of the queue and half of the audio frames currently need to be discarded, the frames with a sensitivity level below 60 are discarded.
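The frame-dropping decision over the sensitivity distribution can be sketched as below; dropping the least-sensitive frames first reproduces the example (with scores 90, 30, 70, 10 and a 50% drop ratio, the frames scored 30 and 10 are discarded). The function name and the index-based return are assumptions.

```python
def frames_to_discard(sensitivities, drop_ratio):
    """Given the sensitivity score of each buffered frame, pick the
    required share of frames to drop, least-sensitive first."""
    n_drop = int(len(sensitivities) * drop_ratio)
    by_score = sorted(range(len(sensitivities)),
                      key=lambda i: sensitivities[i])
    return sorted(by_score[:n_drop])
```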
  • the catch-up strategy for video image playback can be implemented in a manner similar to the above-mentioned implementation manner of the catch-up strategy for audio playback.
• Correspondingly, an embodiment of the present application also proposes a video image playback method, including: when the playback strategy of video image playback is the catch-up strategy, acquiring a preset interactive scene playback delay threshold; and, during video image playback, adjusting the playback so that its progress catches up with the progress of the audio and video data output, such that the playback delay of the video image playback relative to the audio and video data output is less than or equal to the preset interactive scene playback delay threshold.
• In an implementation of the catch-up strategy, the playback delay of the video image playback relative to the audio and video data output is directly monitored; when this playback delay exceeds the interactive scene playback delay threshold, the video image playback is adjusted so that the playback progress of the video image playback end, which plays based on the video image cache data, catches up with the audio and video data output progress, reducing the playback delay until it is less than or equal to the interactive scene playback delay threshold.
  • FIG. 12 is a flowchart of a method for playing a video image according to an embodiment of the present application.
• The video image playback end performs the following process, as shown in Figure 12:
• Step 1110: monitor the playback delay of the video image playback relative to the audio and video data output;
• Step 1120: judge whether the playback delay of the video image playback exceeds the preset interactive scene playback delay threshold;
• when the playback delay of the video image playback exceeds the preset interactive scene playback delay threshold, perform step 1130;
• Step 1130: adjust the video image playback so that the playback progress of the video image playback end, which plays based on the video image cache data, catches up with the audio and video data output progress, reducing the playback delay of the video image playback until it is less than or equal to the interactive scene playback delay threshold.
• In an implementation of step 1110, when outputting video image data, the output device adds a time stamp to the video image data; when playing, the video image playback end compares the time stamp of the video image data being played with the current moment to calculate the playback delay.
• The video image playback end plays the video image cache data in its storage order in the video image cache, and video image data is moved out of the cache after being played. The amount of data cached in the video image cache therefore represents the difference between the progress of the data output end in outputting video image data and the progress of the video image playback end in playing it. Accordingly, to simplify the operation process and reduce the difficulty of implementing the solution, one implementation of the catch-up strategy monitors the cached amount of video image data.
• Specifically, a corresponding video image buffer data volume threshold is determined; when the current amount of buffered video image data exceeds this threshold, it can be determined that the playback delay exceeds the interactive scene playback delay threshold.
• For example, assuming the interactive scene playback delay threshold is 150 ms, the buffered data volume threshold corresponding to the video image can be calculated to be 3 frames.
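The video-side threshold can be derived from the frame rate as sketched below. The 20 fps figure is an assumption chosen only because it reproduces the 3-frame example at a 150 ms threshold; the source frame rate is not stated.

```python
def video_buffer_threshold(threshold_ms, fps):
    """Frames of buffered video whose total duration stays within the
    interactive scene playback delay threshold."""
    frame_ms = 1000.0 / fps       # duration of one video frame
    return int(threshold_ms // frame_ms)
```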
  • FIG. 13 is a flowchart of a method for playing a video image according to an embodiment of the present application.
• The video image playback end performs the following process, as shown in Figure 13:
• Step 1210: monitor the data buffer amount of unplayed data of the audio playback and/or video image playback;
• Step 1220: judge whether the data buffer amount exceeds a preset data buffer threshold, where the data buffer threshold is a value determined according to the interactive scene playback delay threshold: when the executing device is the audio playback end, the data buffer threshold is the audio data buffer threshold; when the executing device is the video image playback end, the data buffer threshold is the video image data buffer threshold;
• when the data buffer amount exceeds the preset data buffer threshold, perform step 1230;
• Step 1230: adjust the data buffer amount of the unplayed video image data so that it is less than or equal to the preset data buffer threshold.
• Step 522 may be implemented by adopting various solutions.
• For example, the playback speed of the video image playback is accelerated (for example, an original playback speed of 10 frames/second is adjusted to 15 frames/second); for another example, when the video image playback end plays based on the video image cache data, it does not read and play the data sequentially in the storage order of the cache, but reads and plays it in a skip mode (for example, reading and playing every other frame of the video image buffer data).
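Both examples can be sketched briefly. The function names are illustrative; the 10 to 15 fps figures come from the example above.

```python
def skip_read(frames, step=2):
    """Skip-mode reading: play every other frame (step=2), so the buffer
    drains twice as fast as sequential reading."""
    return frames[::step]

def catch_up_seconds(backlog_frames, base_fps=10, fast_fps=15):
    """Time to clear a frame backlog by accelerating playback: the
    backlog shrinks by (fast_fps - base_fps) frames per second."""
    return backlog_frames / (fast_fps - base_fps)
```

Under this sketch, a 10-frame backlog clears in 2 seconds when playback is accelerated from 10 to 15 frames/second.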
• In another implementation of step 522, when the playback progress of the video image playback end needs to catch up with the progress of the audio and video data output, all or part of the unplayed video image data in the buffer of the video image playback end is deleted, so that playback skips the deleted data and the playback progress of the video image playback end catches up with the audio and video data output progress.
• The deletion amount of the video image data corresponds to the playback progress of the current video image playback, so that the playback progress of the video image playback end satisfies the interactive scene playback delay threshold.
• For example, the deletion amount of the unplayed video image data corresponds to the amount by which the video image playback delay exceeds the interactive scene playback delay threshold.
• In an implementation of step 1230, data in the video image buffer is deleted so that the video image data buffer amount is less than or equal to the video image buffer data volume threshold.
• In implementations of step 522, those skilled in the art may adopt various schemes to delete the unplayed video image data in the video image cache: for example, delete the earliest-stored unplayed video image data in the video image cache; or randomly select the data to be deleted from the unplayed video image data in the cache; or select the data to be deleted from the unplayed video image data at fixed intervals.
• In an implementation of step 522, when the video image data in the cache needs to be deleted, the video frames to be deleted are selected at fixed intervals.
• The fluctuation of transmission delay can also cause audio and video playback to fall out of synchronization, thereby reducing the user experience.
• In an application scenario, independent devices are used to collect and transmit the audio data and the video image data respectively.
• The simultaneously captured audio data and video image data are transmitted separately. Due to the fluctuation of transmission delay, the transmission delays of the audio data and the video data may differ; the audio data and video data received by the playback end will then be out of synchronization, which may lead to out-of-sync playback of the audio data and video data.
• For example, when the drone 11 collects the video image data A1, the microphone 12 collects the audio data B1 synchronized with the video image data A1.
• The drone 11 transmits the video image data A1 to the mobile phone 13, and the microphone 12 transmits the audio data B1 to the mobile phone 13. If the video image data A1 and the audio data B1 are received synchronously on the mobile phone 13, and the mobile phone 13 plays them as they are received, the playback of the video image data A1 and the audio data B1 is synchronized.
• If, however, the transmission delay of the microphone 12 transmitting the audio data B1 to the mobile phone 13 is greater than the transmission delay of the drone 11 transmitting the video image data A1, then the reception of the video image data A1 and the audio data B1 on the mobile phone 13 is not synchronized: the mobile phone 13 receives the video image data A1 first. If the mobile phone 13 plays the video image data A1 while receiving it, the audio data B1 has not yet arrived when playback of A1 starts, and the audio and video playback on the mobile phone 13 will ultimately be out of synchronization. The user experience of virtual tours in the community will be greatly reduced.
• One solution to asynchronous audio and video playback is to integrate the video image data and the audio data before playback, generating a video file with synchronized audio and video, and then play that video file to achieve audio and video synchronization.
• For example, the mobile phone 13 receives the video image data A1 and the audio data B1, integrates them to generate a video file C1, and plays the video file C1 to realize audio and video synchronization.
• However, in some application scenarios, the application (APP) that receives the video image data of the drone 11 and the application that receives the audio data of the microphone 12 are applications from different manufacturers.
• The application that receives the video image data of the drone 11 can play the video image data as it is received, and the application that receives the audio data of the microphone 12 can play the audio data as it is received.
• But the integration of the audio and video data cannot easily be realized: for example, the video image data and audio data would need to be imported into a third-party video production application and integrated after synchronizing time tags to generate the video file.
• As a result, the user can only perform a virtual tour after the drone 11 and the microphone 12 have completed the audio and video collection, rather than taking a synchronized virtual tour while the drone 11 and the microphone 12 are collecting the audio and video.
• In another application scenario, different playback ends are used to play the audio data and the video image data respectively.
• For example, the difference in transmission delay between the audio data sent from the mobile phone 43 to the smart speaker 42 and the video image data sent from the mobile phone 43 to the large-screen TV 41 may cause the audio and video playback to be out of synchronization.
• Likewise, the difference in transmission delay between the audio data sent from the mobile phone 63 to the smart speaker 62 and the video image data sent from the mobile phone 63 to the large-screen TV 61 may cause the audio and video playback to be out of synchronization.
  • an audio and video synchronization operation is introduced.
• In an implementation, a catch-up strategy is adopted for both audio playback and video image playback to achieve audio and video synchronization. That is, the audio playback and the video image playback are both adjusted so that the playback progress of the audio playback end and the video image playback end catches up with the progress of the audio and video data output, and the playback delays of the audio playback and video image playback relative to the audio and video data output are less than or equal to the interactive scene playback delay threshold.
• Using the catch-up strategy for both audio playback and video image playback controls the audio playback delay and the video image playback delay within the interactive scene playback delay threshold. Since the playback delay is the delay between data playback and data output, and the audio output and video image output are synchronized, the difference in playback progress between the audio playback and the video image playback is also controlled within the interactive scene playback delay threshold, thus realizing the synchronization of audio playback and video image playback.
  • a catch-up strategy is only used for one playback process in audio playback and video image playback, so that the playback delay of the playback process using the catch-up strategy is controlled within the interactive scene playback delay threshold.
  • the synchronization strategy is adopted for playback.
  • the playback process using the synchronization strategy is adjusted so that the playback progress of the playback process using the synchronization strategy is synchronized with the playback progress of the playback process using the catch-up strategy.
  • a catch-up strategy based on audio and video output progress is adopted; for video image playback, a synchronization strategy based on audio playback progress is adopted.
  • FIG. 14 is a flowchart of a method for playing audio and video according to an embodiment of the present application.
  • the audio and video playback method is executed by one or more devices in the audio and video playback system (including audio and video data output devices, audio playback terminals, and video image playback terminals). As shown in Figure 14:
  • Step 1300 identifying the current application scenario
  • Step 1301 judging whether the current application scene is an interactive application scene
  • step 1310 When the current application scenario is not an interactive application scenario, perform step 1310;
  • Step 1310 using a smooth playback strategy to play audio and/or video images
  • step 1320 When the current application scenario is an interactive application scenario, perform step 1320;
  • Step 1320 using the catch-up strategy to play audio
  • Step 1330 determine whether the playback progress of the video image is synchronized with the playback progress of the audio
  • step 1340 When the playback progress of the video image is not synchronized with the playback progress of the audio, perform step 1340;
  • Step 1340 Adjust the playback of the video image based on the playback progress of the audio, so that the playback progress of the video image is synchronized with the playback progress of the audio.
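  • The flow of Fig. 14 above can be sketched as a small decision function. This is a minimal illustrative sketch in Python, assuming a 150ms synchronous playback delay threshold and hypothetical action names; it is not the implementation of the actual devices:

```python
# Minimal sketch of the Fig. 14 control flow: in interactive scenes the
# audio uses a catch-up strategy (step 1320) and video image playback is
# kept in step with the audio playback progress (steps 1330-1340).
# All names and the 150ms value are illustrative assumptions.

SYNC_THRESHOLD_MS = 150  # synchronous playback delay threshold

def choose_playback_actions(is_interactive, audio_progress_ms, video_progress_ms):
    """Return the actions the playback ends should take (steps 1300-1340)."""
    if not is_interactive:
        return ["smooth_play_audio_and_video"]            # step 1310
    actions = ["catch_up_play_audio"]                     # step 1320
    # step 1330: compare the playback progress of video and audio
    if abs(audio_progress_ms - video_progress_ms) > SYNC_THRESHOLD_MS:
        actions.append("adjust_video_to_audio_progress")  # step 1340
    return actions
```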
  • the video image is used to reflect the status of the remote device, while the live sound only plays an assisting role, and a video image freeze can easily lead to control errors; therefore, it is necessary to give priority to ensuring the smoothness of video image playback and to minimize the number of adjustments to the video image playback progress.
  • a catch-up strategy based on audio and video output progress is adopted; for audio playback, a synchronization strategy based on video image playback progress is adopted.
  • FIG. 15 is a flowchart of a method for playing audio and video according to an embodiment of the present application.
  • the audio and video playback method is executed by one or more devices in the audio and video playback system (including audio and video data output devices, audio playback terminals, and video image playback terminals). As shown in Figure 15:
  • Step 1400 identifying the current application scenario
  • Step 1401 judging whether the current application scene is an interactive application scene
  • step 1410 When the current application scenario is not an interactive application scenario, perform step 1410;
  • Step 1410 using a smooth playback strategy to play audio and/or video images
  • step 1420 When the current application scenario is an interactive application scenario, perform step 1420;
  • Step 1420 using a catch-up strategy to play video images
  • Step 1430 determine whether the playback progress of the video image is synchronized with the playback progress of the audio
  • step 1420 When the playback progress of the video image is synchronized with the playback progress of the audio, return to step 1420;
  • step 1440 When the playback progress of the video image is not synchronized with the playback progress of the audio, perform step 1440;
  • Step 1440 Adjust the audio playback based on the playback progress of the video image, so that the playback progress of the audio is synchronized with the playback progress of the video image.
  • judgment based on the synchronous playback delay threshold is used to determine whether the playback progress of the audio is synchronized with the playback progress of the video image.
  • a synchronous playback delay threshold is preset, which is the maximum difference in playback progress between audio playback and video image playback that still satisfies the expected user viewing experience. For example, when the difference between the playback progress of audio and video exceeds 150ms, the user will perceive an obvious inconsistency between audio and video playback; therefore, the synchronous playback delay threshold is set to 150ms. When the difference between the playback progress of the audio and the playback progress of the video image exceeds the synchronous playback delay threshold, it is determined that the playback progress of the audio is not synchronized with the playback progress of the video image.
  • the difference in playback progress can be represented by a timestamp.
  • when a piece of audio data is played at the audio playback end, the playback time is recorded (time stamp T1); when the corresponding video image data B9 is played at the video image playback end, the playback time is recorded (time stamp T2); the interval between T1 and T2 is the difference in playback progress between audio playback and video image playback.
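  • The timestamp comparison described above can be expressed as a one-line check. This is an illustrative sketch assuming millisecond time stamps and the 150ms example threshold:

```python
# T1: time stamp recorded when the audio data is played at the audio end;
# T2: time stamp recorded when the matching video image data is played.
# The interval between them is the playback-progress difference.

SYNC_PLAYBACK_DELAY_THRESHOLD_MS = 150  # example value from the text

def is_out_of_sync(t1_ms, t2_ms, threshold_ms=SYNC_PLAYBACK_DELAY_THRESHOLD_MS):
    """True when audio and video playback progress differ beyond the threshold."""
    return abs(t1_ms - t2_ms) > threshold_ms
```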
  • the unplayed data in the cache can be deleted to accelerate the playback progress, or transition data can be added to the unplayed data in the cache to delay the playback progress, so that the playback progress of the audio is synchronized with the playback progress of the video image.
  • a transition frame is added between two video image frames (the transition frame can be a duplicate of an adjacent video image frame), so that playback of the transition frame is added to the playback process at the video image playback end, thereby delaying the video image playback progress.
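  • The two adjustment methods above can be sketched on a simple frame buffer. This is an illustrative model (frames as list items), not the codec-level mechanism of an actual playback device:

```python
# Deleting unplayed frames from the cache advances the playback progress;
# inserting transition frames (duplicates of an adjacent frame) delays it.

def advance_by_dropping(buffer, n):
    """Drop the n oldest unplayed frames so playback catches up."""
    return buffer[n:]

def delay_with_transition_frames(buffer, n):
    """Insert n duplicate transition frames after the first frame."""
    if not buffer:
        return buffer
    return buffer[:1] + [buffer[0]] * n + buffer[1:]
```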
  • step 520 audio and video synchronization is implemented based on monitoring the data transmission delay.
  • a playback cache link is set: after the playback device receives the audio data or video image data and before it is played, the data is additionally buffered for a certain period of time (playback buffering) before playing, and the buffering duration is the difference between the audio data transmission delay and the video image data transmission delay, so that this difference can be compensated to ensure that the audio data and the video image data are played synchronously.
  • the playback buffering link is limited.
  • the duration of cached data cannot exceed the interactive scene playback delay threshold. Since the cache duration of the playback cache is the difference between the audio data transmission delay and the video image data transmission delay, when this difference exceeds the interactive scene playback delay threshold, cached data is deleted from the cache while new data is added, so as to realize the catch-up strategy on the premise of maintaining the synchronization of audio and video playback.
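  • The limited playback-buffer rule can be expressed as follows. This is a sketch under assumed names, with the 150ms interactive scene playback delay threshold used as an example:

```python
# The stream that arrives faster buffers for the transmission-delay
# difference, but the buffered duration may not exceed the interactive
# scene playback delay threshold; the excess is discarded so that
# playback also catches up.

INTERACTIVE_THRESHOLD_MS = 150  # example threshold

def playback_buffer_plan(own_delay_ms, peer_delay_ms,
                         threshold_ms=INTERACTIVE_THRESHOLD_MS):
    """Return (buffer_ms, discard_ms) for this playback device."""
    diff = peer_delay_ms - own_delay_ms
    if diff <= 0:
        return 0, 0                      # we are the slower stream: play directly
    if diff <= threshold_ms:
        return diff, 0                   # fully compensate the difference
    return threshold_ms, diff - threshold_ms  # cap the buffer, drop the excess
```

With the numbers of the game projection example (TV delay 100ms, speaker delay 200ms), the TV side would buffer 100ms; if the speaker delay deteriorates to 300ms, the TV caps its buffer at 150ms and discards the 50ms excess.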
  • the user uses the mobile phone 63 to play games.
  • in the game screen projection mode, the mobile phone 63 does not output game audio and game video images through its own speaker and screen; instead, it sends the game video images to the large-screen TV 61 for display and sends the game audio to the smart speaker 62 for playback.
  • the user obtains the game content through the image output of the large-screen TV 61 and the audio output of the smart speaker 62, and performs corresponding game operations on the mobile phone. Since there is real-time interaction between the user and the mobile phone 63, the game screen projection application scenario is an interactive application scenario. Specifically, after the user activates the game screen projection mode, the mobile phone 63, the large-screen TV 61 and the smart speaker 62 execute the following processes:
  • after the mobile phone 63 recognizes the game screen projection scene, it establishes connections with the large-screen TV 61 and the smart speaker 62 respectively;
  • the large-screen TV 61 reports a network delay of 100ms to the mobile phone 63, and the smart speaker 62 reports a network delay of 200ms to the mobile phone 63;
  • the mobile phone 63 sends a catch-up strategy to the large-screen TV 61 and the smart speaker 62 respectively (the interactive scene playback delay threshold is 150ms); the mobile phone 63 also sends the network delay data of the large-screen TV 61 to the smart speaker 62, and the network delay data of the smart speaker 62 to the large-screen TV 61;
  • the mobile phone 63 starts to send video image frames to the large-screen TV 61 and audio frames to the smart speaker 62;
  • the smart speaker 62 learns that the network delay of the large-screen TV 61 is 100ms, which is less than its own network delay of 200ms, so it directly plays the audio frame;
  • the large-screen TV 61 knows that the network delay of the smart speaker 62 is 200ms, so it buffers the video image frames for 100ms (< the interactive scene playback delay threshold of 150ms) before decoding and playback, thereby synchronizing with the audio frame playback of the smart speaker 62.
  • when the network delay of the smart speaker 62 further deteriorates to 300ms while the network delay of the large-screen TV 61 is still 100ms, the smart speaker 62 reports a network delay of 300ms to the mobile phone 63, and the large-screen TV 61 reports a network delay of 100ms to the mobile phone 63;
  • the mobile phone 63 notifies the large-screen TV 61 and the smart speaker 62 of the updated network delay data respectively;
  • the large-screen TV 61 updates the network delay of the smart speaker 62 to 300ms. By calculation, 200ms of buffering would be needed to synchronize with the audio playback, which exceeds the 150ms threshold. Therefore, the large-screen TV 61 discards the first 50ms of the 100ms of video image data previously buffered, then buffers another 100ms of video image data (150ms in total) before decoding and playing it, so as to synchronize the video image playback with the audio playback.
  • if the network of the large-screen TV 61 is interrupted for 50ms and then recovers, since the large-screen TV 61 currently holds a 150ms video buffer, during the 50ms interruption there will still be video image data available to be played synchronously with the audio data of the smart speaker 62.
  • step 520 the transmission delays of the audio playback device and the video image playback device are periodically refreshed, and playback buffering is performed based on the data transmission delays to synchronize audio playback and video image playback.
  • the synchronization effect is improved by combining the audio and video synchronization scheme based on data transmission delay with other audio and video synchronization schemes.
  • a time stamp synchronization operation is performed periodically: the time stamps of the currently playing audio data and video image data are compared to determine the difference in playback progress, and audio playback or video image playback is adjusted so that the playback progress difference between them is controlled within the synchronous playback delay threshold, thereby achieving audio and video playback synchronization.
  • the audio and video playback scheme based on data transmission delay is used to synchronize audio and video playback.
  • the data transmission delays of audio data and video image data are regularly refreshed, and playback buffering is performed according to the difference between the data transmission delays to achieve audio and video playback synchronization.
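  • The periodic refresh described above can be sketched as a loop that re-measures the delays each cycle and recomputes the playback buffer from the new difference. `measure_delays` and `apply_buffer` are hypothetical stand-ins, and the threshold cap discussed elsewhere is omitted for brevity:

```python
def refresh_cycle(measure_delays, apply_buffer, cycles):
    """Run `cycles` refresh rounds; measure_delays() -> (audio_ms, video_ms)."""
    history = []
    for _ in range(cycles):
        audio_ms, video_ms = measure_delays()
        # the stream with the lower transmission delay buffers the difference
        buffer_ms = abs(audio_ms - video_ms)
        apply_buffer(buffer_ms)
        history.append(buffer_ms)
    return history
```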
  • the playback device does not play the audio data or video image data immediately after receiving it, but buffers a certain amount of data to cope with fluctuations in transmission delay.
  • the playback device extracts data from the audio buffer or video image buffer, so that the extracted data enters the playback link.
  • a playback buffer link is added to the playback link. Specifically, after the audio data or video image data is extracted from the audio buffer or video image buffer, and before it is played, the data is additionally buffered for a certain period of time (playback buffering).
  • the buffering duration is the difference between the audio data transmission delay and the video image data transmission delay, so that this difference can be compensated to ensure that the audio data and the video image data are played synchronously.
  • FIG. 16 shows a partial flowchart of an audio and video playback method according to an embodiment of the present application.
  • the following process shown in Figure 16 is performed to realize audio and video playback synchronization:
  • Step 1510 Before the transmission of audio data and video image data, the initial transmission delays of audio data transmission and video image transmission are obtained; the data type with the higher transmission delay is the first type of data, and the data type with the lower transmission delay is the second type of data. The initial data transmission delay of the first type of data is the first delay, and that of the second type of data is the second delay (when the transmission delays are the same, the first type and second type of data are assigned arbitrarily);
  • Step 1511 determine whether the first delay difference is less than or equal to the interactive scene playback delay threshold, wherein the first delay difference is the difference between the first delay and the second delay;
  • Step 1512 when the first delay difference is less than or equal to the interactive scene playback delay threshold, transmit the first type of data and the second type of data; the first type of data is played directly after entering the playback link, while the second type of data, after entering the playback link, is buffered for the first delay difference before being played;
  • Step 1513 when the first delay difference is greater than the interactive scene playback delay threshold, it means that in the current application scenario the real-time interaction requirements cannot be met if audio and video synchronization is to be ensured; therefore, an excessive output delay prompt is issued to remind the user that the current data transmission link cannot meet the real-time interaction requirements.
  • since the second type of data is buffered before being played, this is equivalent to the second type of data waiting for the first type of data, so it can be ensured that the first type of data and the second type of data are played synchronously.
  • After step 1512 is performed, the following steps are also performed:
  • Step 1520 obtaining the current transmission delay of audio data transmission and video image transmission, taking the current data transmission delay of the first type of data as the third delay, and the current data transmission delay of the second type of data as the fourth delay;
  • Step 1521 determine whether the current transmission delay of the first type of data is higher than that of the second type of data
  • Step 1531 when the current transmission delay of the first type of data is higher than that of the second type of data, determine whether the second delay difference is less than or equal to the interactive scene playback delay threshold, where the second delay difference is the difference between the third delay and the fourth delay;
  • Step 1532 when the second delay difference is less than or equal to the interactive scene playback delay threshold, the first type of data is played directly after entering the playback link, and the second type of data, after entering the playback link, is buffered for the second delay difference before being played;
  • Step 1533 when the second delay difference is greater than the interactive scene playback delay threshold, the first type of data is played directly after entering the playback link; part of the cached second type of data is deleted, the deletion amount being the amount by which the second delay difference exceeds the interactive scene playback delay threshold; the cached amount of the second type of data is then increased, and playback of the second type of data starts when its cached amount reaches the interactive scene playback delay threshold.
  • After step 1521, the following steps are also performed:
  • Step 1541 when the current transmission delay of the second type of data is higher than that of the first type of data, determine whether the third delay difference is less than or equal to the interactive scene playback delay threshold, where the third delay difference is the difference between the fourth delay and the third delay;
  • Step 1542 when the third delay difference is less than or equal to the interactive scene playback delay threshold, the second type of data is played directly after entering the playback link, and the first type of data, after entering the playback link, is buffered for the third delay difference before being played;
  • Step 1543 when the third delay difference is greater than the interactive scene playback delay threshold, the second type of data is played directly after entering the playback link; part of the cached first type of data is deleted, the deletion amount being the amount by which the third delay difference exceeds the interactive scene playback delay threshold; the cached amount of the first type of data is then increased, and playback of the first type of data starts when its cached amount reaches the interactive scene playback delay threshold.
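  • Steps 1520 through 1543 can be condensed into one decision function. This is an illustrative sketch under assumed names, where "first type" is the stream whose initial transmission delay was higher and the 150ms threshold is the example value used in the text:

```python
THRESHOLD_MS = 150  # interactive scene playback delay threshold (example)

def update_plan(third_delay_ms, fourth_delay_ms, threshold_ms=THRESHOLD_MS):
    """Decide which stream buffers, for how long, and how much cached
    data to discard: returns (waiting_stream, buffer_ms, discard_ms)."""
    if third_delay_ms >= fourth_delay_ms:
        # first type is still slower: the second type waits (steps 1531-1533)
        diff = third_delay_ms - fourth_delay_ms   # second delay difference
        waiter = "second"
    else:
        # roles reversed: the first type now waits (steps 1541-1543)
        diff = fourth_delay_ms - third_delay_ms   # third delay difference
        waiter = "first"
    if diff <= threshold_ms:
        return waiter, diff, 0                    # steps 1532 / 1542
    # steps 1533 / 1543: discard the excess, rebuild the cache to the threshold
    return waiter, threshold_ms, diff - threshold_ms
```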
  • an audio and video synchronization operation is introduced based on a smooth playback strategy.
  • a synchronization strategy is used for one of the two playback processes, audio playback and video image playback.
  • the playback process using the synchronization strategy is adjusted so that its playback progress is synchronized with the playback progress of the playback process using the smooth playback strategy.
  • a smooth playback strategy is adopted for audio playback, and a synchronization strategy based on audio playback progress is adopted for video image playback.
  • a smooth playback strategy is adopted for video image playback, and a synchronization strategy based on video image playback progress is adopted for audio playback.
  • a smooth playback strategy is adopted for audio playback and video image playback.
  • a synchronization strategy based on audio playback progress is adopted on the basis of smooth playback.
  • a smooth playback strategy is adopted for audio playback and video image playback.
  • a synchronization strategy based on video and image playback progress is adopted on the basis of smooth playback.
  • step 510 those skilled in the art can implement the synchronization strategy in various manners.
  • the method based on the synchronous playback delay threshold is used to judge whether the playback progress of the audio is synchronized with the playback progress of the video image (for example, the difference in playback progress is reflected by time stamps); during playback, audio playback or video image playback is adjusted so that the playback progress difference between them is controlled within the synchronous playback delay threshold, achieving audio and video playback synchronization on the basis of satisfying the real-time requirements of audio playback and video image playback.
  • the playback progress can be accelerated by deleting unplayed data in the cache, or delayed by adding transition data to the unplayed data in the cache, so that the playback progress of the audio is synchronized with the playback progress of the video image.
  • audio and video synchronization is implemented based on monitoring the data transmission delay.
  • PLD Programmable Logic Device
  • FPGA Field Programmable Gate Array
  • HDL Hardware Description Language
  • ABEL Advanced Boolean Expression Language
  • AHDL Altera Hardware Description Language
  • HDCal
  • JHDL
  • Lava
  • Lola
  • MyHDL
  • PALASM
  • RHDL
  • VHDL Very-High-Speed Integrated Circuit Hardware Description Language
  • Verilog
  • the controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, or embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory.
  • in addition to implementing the controller in the form of pure computer-readable program code, the method steps can be logically programmed so that the controller realizes the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Therefore, such a controller can be regarded as a hardware component, and the devices included therein for realizing various functions can also be regarded as structures within the hardware component; or even, the means for implementing various functions can be regarded as both software modules implementing the method and structures within the hardware component.
  • an embodiment of the present application proposes an audio and video playback device, which is installed in the audio and video output device 701 as shown in FIG. 7 , and the device includes:
  • a scene identification module which is used to identify the current application scene
  • a playback strategy configuration module, which is used to configure the catch-up strategy as the playback strategy for audio playback and/or video image playback when the current application scenario is an interactive application scenario, wherein the catch-up strategy includes: adjusting playback so that the playback delay relative to the audio and video data output is less than or equal to the preset interactive scene playback delay threshold.
  • the playback strategy configuration module is also used for: configuring the smooth playback strategy as the playback strategy for audio playback and/or video image playback when the current application scenario is a non-interactive application scenario.
  • an embodiment of the present application proposes an audio playback device.
  • the device is installed in the audio playback device 702 as shown in FIG. 7 , and the audio playback device includes:
  • a threshold acquisition module which is used to acquire a preset interactive scene playback delay threshold when the playback strategy of audio playback is a catch-up strategy
  • the first playback adjustment module is used to adjust the audio playback during audio playback when the playback strategy of the audio playback is a catch-up strategy, so that the playback progress of the audio playback catches up with the progress of the audio and video data output, and the playback delay of the audio playback compared to the audio and video data output is less than or equal to the preset interactive scene playback delay threshold.
  • the audio playback device also includes:
  • the second playback adjustment module is configured to adjust the audio playback based on the smooth playback strategy during the audio playback when the playback strategy of the audio playback is the smooth playback strategy.
  • an embodiment of the present application proposes a video image playback device.
  • the device is installed in the video image playback device 703 as shown in FIG. 7 , and the video image playback device includes:
  • a threshold acquisition module which is used to acquire a preset interactive scene playback delay threshold when the playback strategy for video image playback is a catch-up strategy
  • the first playback adjustment module is used to adjust the playback of the video image during video image playback when the playback strategy of the video image playback is a catch-up strategy, so that the playback progress of the video image playback catches up with the progress of the audio and video data output, and the playback delay of the video image playback compared to the audio and video data output is less than or equal to the preset interactive scene playback delay threshold.
  • the video image playback device also includes:
  • the second playback adjustment module is configured to adjust the playback of the video image based on the smooth playback strategy during the process of playing the video image when the playback strategy of the video image playback is the smooth playback strategy.
  • an embodiment of the present application proposes an audio, video and image playback device, the device is installed in the playback device 603 as shown in FIG. 6 , and the device includes:
  • a scene identification module which is used to identify the current application scene
  • a playback strategy configuration module, which is used to configure the catch-up strategy as the playback strategy for audio playback and/or video image playback when the current application scenario is an interactive application scenario, and to configure the smooth playback strategy as the playback strategy for audio playback and/or video image playback when the current application scenario is a non-interactive application scenario;
  • a threshold acquisition module which is used to acquire a preset interactive scene playback delay threshold when the playback strategy of audio playback and/or video image playback is a catch-up strategy
  • the first playback adjustment module, which is used to adjust audio playback and/or video image playback when the playback strategy of audio playback and/or video image playback is a catch-up strategy, so that the playback progress of audio playback and/or video image playback catches up with the progress of the audio and video data output, and the playback delay of audio playback and/or video image playback compared to the audio and video data output is less than or equal to the preset interactive scene playback delay threshold;
  • the second playback adjustment module is configured to adjust audio playback and/or video image playback based on the smooth playback strategy when the playback strategy for audio playback and/or video image playback is a smooth playback strategy.
  • for convenience of description, the functions are divided into various modules/units when describing the apparatus; the division of modules/units is only a logical function division, and the functions of the modules/units may be implemented in one or more pieces of software and/or hardware.
  • the apparatuses proposed in the embodiments of the present application may be fully or partially integrated into a physical entity during actual implementation, or may be physically separated.
  • these modules can all be implemented in the form of software calling through processing elements; they can also all be implemented in hardware; some modules can also be implemented in the form of software calling through processing elements, and some modules can be implemented in hardware.
  • the detection module may be a separately established processing element, or may be integrated in a certain chip of the electronic device.
  • the implementation of other modules is similar.
  • all or part of these modules can be integrated together, and can also be implemented independently.
  • each step of the above-mentioned method or each of the above-mentioned modules can be completed by an integrated logic circuit of hardware in the processor element or an instruction in the form of software.
  • the above modules may be one or more integrated circuits configured to implement the above methods, for example: one or more application-specific integrated circuits (Application Specific Integrated Circuit, ASIC), or one or more digital signal processors (Digital Signal Processor, DSP), or one or more field-programmable gate arrays (Field Programmable Gate Array, FPGA), etc.
  • these modules can be integrated together and implemented in the form of a system-on-a-chip (System-On-a-Chip, SOC).
  • An embodiment of the present application also proposes an electronic device (audio and video data output device). The electronic device includes a memory for storing computer program instructions and a processor for executing the program instructions, wherein when the computer program instructions are executed by the processor, the electronic device is triggered to execute the steps of the audio and video playback method described in the embodiments of the present application.
  • An embodiment of the present application also proposes an electronic device (audio playback device). The electronic device includes a memory for storing computer program instructions and a processor for executing the program instructions, wherein when the computer program instructions are executed by the processor, the electronic device is triggered to execute the steps of the audio playback method described in the embodiments of the present application.
  • An embodiment of the present application also proposes an electronic device (video image playback device). The electronic device includes a memory for storing computer program instructions and a processor for executing the program instructions, wherein when the computer program instructions are executed by the processor, the electronic device is triggered to execute the steps of the video image playback method described in the embodiments of the present application.
  • An embodiment of the present application also proposes an electronic device (audio and video playback device). The electronic device includes a memory for storing computer program instructions and a processor for executing the program instructions, wherein when the computer program instructions are executed by the processor, the electronic device is triggered to execute the steps of the audio and video playback method described in the embodiments of the present application.
  • the above-mentioned one or more computer programs are stored in the above-mentioned memory, and the above-mentioned one or more computer programs include instructions. When the above-mentioned instructions are executed by the above-mentioned device, the above-mentioned device is made to execute the method steps described in the embodiments of the present application.
  • the processor of the electronic device may be a system-on-chip (SoC); the processor may include a central processing unit (Central Processing Unit, CPU), and may further include other types of processors.
  • the processor of the electronic device may be a PWM control chip.
  • the involved processor may include, for example, a CPU, a DSP, a microcontroller, or a digital signal processor, and may further include a GPU, an embedded neural-network processing unit (Neural-network Processing Unit, NPU), and an image signal processor (Image Signal Processor, ISP). The processor may also include necessary hardware accelerators or logic processing hardware circuits, such as an ASIC, or one or more integrated circuits for controlling the execution of the programs of the technical solution of the present application, etc. Furthermore, the processor may have the function of operating one or more software programs, which may be stored in a storage medium.
  • the memory of the electronic device may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other computer-readable medium that can be used to carry or store desired program code in the form of instructions or data structures and can be accessed by a computer.
  • a processor may be combined with a memory to form a processing device; more commonly, they are components independent of each other.
  • the processor is used to execute program codes stored in the memory to implement the method described in the embodiment of the present application.
  • the memory can also be integrated in the processor, or be independent of the processor.
  • the device, apparatus, module or unit described in the embodiments of the present application may be specifically implemented by a computer chip or entity, or implemented by a product having a certain function.
  • the embodiments of the present application may be provided as a method, an apparatus, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied therein.
  • if any function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when it runs on a computer, it causes the computer to execute the method provided by the embodiments of the present application.
  • An embodiment of the present application further provides a computer program product, where the computer program product includes a computer program that, when running on a computer, causes the computer to execute the method provided by the embodiment of the present application.
  • These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus, which implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • "At least one of a, b, and c" may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where each of a, b, and c may be singular or plural.
  • the terms "comprising", "including", or any other variations thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements, but also other elements not expressly listed, or elements inherent to such a process, method, commodity, or device.
  • an element qualified by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, commodity, or device that includes the element.
  • the application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
  • program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • the application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote computer storage media including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present application relates to an audio/video playback method and apparatus, and an electronic device. The audio/video playback method comprises: recognizing the current application scenario; and configuring, according to the current application scenario, a playback policy for audio playback and/or video image playback, which comprises: when the current application scenario is an interactive application scenario, configuring a catch-up policy as the playback policy for audio playback and/or video image playback. According to the method of the embodiments of the present application, the audio/video playback scenario is recognized, and a corresponding playback policy is selected according to the specific application scenario, which can considerably improve the user experience of audio/video playback.
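The scenario-driven policy selection described in the abstract can be sketched as follows. This is an illustrative sketch only: the class names, scenario categories, and buffer thresholds are assumptions for demonstration, not details taken from the claims.

```python
# Illustrative sketch of scenario-based playback policy selection.
# All names and numeric thresholds here are hypothetical.
from dataclasses import dataclass
from enum import Enum, auto

class Scenario(Enum):
    INTERACTIVE = auto()      # e.g. screen projection, video call
    NON_INTERACTIVE = auto()  # e.g. on-demand movie playback

@dataclass
class PlaybackPolicy:
    name: str
    # With a catch-up policy, the player shortens its buffer (by playing
    # slightly faster or dropping late frames) to minimize end-to-end latency.
    catch_up: bool
    target_buffer_ms: int

def select_policy(scenario: Scenario) -> PlaybackPolicy:
    """Configure the audio/video playback policy for the recognized scenario."""
    if scenario is Scenario.INTERACTIVE:
        # Interactive scenarios favor low latency: use a catch-up policy.
        return PlaybackPolicy("catch-up", catch_up=True, target_buffer_ms=100)
    # Non-interactive scenarios favor smoothness: buffer more, never skip ahead.
    return PlaybackPolicy("smooth", catch_up=False, target_buffer_ms=1000)

policy = select_policy(Scenario.INTERACTIVE)
print(policy.name, policy.target_buffer_ms)
```

The design point is simply that the same player can trade latency for smoothness per scenario, rather than applying one fixed buffering strategy to all content.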
PCT/CN2021/143557 2021-02-26 2021-12-31 Audio/video playback method and apparatus, and electronic device WO2022179306A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110217016.8 2021-02-26
CN202110217016.8A CN114979783B (zh) 2021-02-26 2021-02-26 Audio and video playback method and apparatus, and electronic device

Publications (1)

Publication Number Publication Date
WO2022179306A1 true WO2022179306A1 (fr) 2022-09-01

Family

ID=82972806

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/143557 WO2022179306A1 (fr) 2021-12-31 Audio/video playback method and apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN114979783B (fr)
WO (1) WO2022179306A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116170622A (zh) * 2023-02-21 2023-05-26 阿波罗智联(北京)科技有限公司 Audio and video playback method, apparatus, device and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080012985A1 (en) * 2006-07-12 2008-01-17 Quanta Computer Inc. System and method for synchronizing video frames and audio frames
CN106131655A (zh) * 2016-05-19 2016-11-16 安徽四创电子股份有限公司 Real-time-video-based playback method and smooth catch-up playback method
CN106790576A (zh) * 2016-12-27 2017-05-31 深圳市汇龙建通实业有限公司 Interactive desktop synchronization method
CN107277558A (zh) * 2017-06-19 2017-10-20 网宿科技股份有限公司 Player client, system and method for realizing live video synchronization
CN107483972A (zh) * 2017-07-24 2017-12-15 平安科技(深圳)有限公司 Live audio/video processing method, storage medium and mobile terminal
CN111372138A (zh) * 2018-12-26 2020-07-03 杭州登虹科技有限公司 Player-side low-latency live-streaming technical solution

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109144463B (zh) * 2018-08-14 2020-08-25 Oppo广东移动通信有限公司 Transmission control method, apparatus and electronic device


Also Published As

Publication number Publication date
CN114979783A (zh) 2022-08-30
CN114979783B (zh) 2024-04-09

Similar Documents

Publication Publication Date Title
JP5957760B2 (ja) Video and audio processing device
US7822050B2 (en) Buffering, pausing and condensing a live phone call
JP5026167B2 (ja) Stream transmission server and stream transmission system
CN106686438B (zh) Method, apparatus and system for synchronized cross-device playback of audio and images
CN113286184B (zh) Lip-sync method for separately playing audio and video on different devices
US9621949B2 (en) Method and apparatus for reducing latency in multi-media system
WO2020125153A1 (fr) Smooth network video playback control method based on streaming media technology
US10582258B2 (en) Method and system of rendering late or early audio-video frames
JPWO2006082787A1 (ja) Recording/playback device, recording/playback method, recording medium storing a recording/playback program, and integrated circuit used in the recording/playback device
JP2003114845A (ja) Media conversion method and media conversion device
CN101710997A (zh) Method and system for realizing video and audio synchronization based on the MPEG-2 system
CN112822502A (zh) Intelligent caching for live-stream de-jitter and live-streaming method, device and storage medium
MX2011005782A (es) Method and apparatus for controlling playback of video-audio data
US20130166769A1 (en) Receiving device, screen frame transmission system and method
JP2002033771A (ja) Media data processor
WO2022179306A1 (fr) Audio/video playback method and apparatus, and electronic device
KR100490403B1 (ko) Method and apparatus for controlling buffering of an audio stream
CN111352605A (zh) Audio playback and transmission method and apparatus
JP2007095163A (ja) Multimedia encoded data separation and transmission device
CN116261001A (zh) Surgical recording and broadcasting method and system
CN113613221B (zh) TWS master device, TWS slave device, audio device and system
CN114242067A (zh) Speech recognition method, apparatus, device and storage medium
US20070248170A1 (en) Transmitting Apparatus, Receiving Apparatus, and Reproducing Apparatus
WO2016104178A1 (fr) Signal processing device, signal processing method, and program
CN113535115A (zh) Audio playback method and apparatus, electronic device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21927730

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21927730

Country of ref document: EP

Kind code of ref document: A1