CN113038193B - Method for automatically repairing asynchronous audio and video and display equipment


Info

Publication number
CN113038193B
CN113038193B (application CN202110312267.4A)
Authority
CN
China
Prior art keywords
audio
time
data
video
injection
Prior art date
Legal status
Active
Application number
CN202110312267.4A
Other languages
Chinese (zh)
Other versions
CN113038193A (en)
Inventor
汤小娜
杨依灿
Current Assignee
Vidaa Netherlands International Holdings BV
Vidaa USA Inc
Original Assignee
Vidaa Netherlands International Holdings BV
Vidaa USA Inc
Priority date
Filing date
Publication date
Application filed by Vidaa Netherlands International Holdings BV, Vidaa USA Inc filed Critical Vidaa Netherlands International Holdings BV
Priority to CN202110312267.4A
Publication of CN113038193A
Application granted
Publication of CN113038193B


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4305Synchronising client clock from received content stream, e.g. locking decoder clock with encoder clock, extraction of the PCR packets
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/239Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
    • H04N21/2393Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests involving handling client requests
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47202End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting content on demand, e.g. video on demand

Abstract

The application discloses a method for automatically repairing an audio/video out-of-sync state, and a display device, which mitigate audio/video desynchronization and return the audio and video to a synchronized state. The method comprises: receiving an instruction to play audio and video data, and sending an audio data request and a video data request; determining the audio injection data time from the fed-back audio data; determining the video injection data time from the fed-back video data; decoding the audio data and the video data, and determining the audio decoding data time and the video decoding data time; when the audio decoding data time is greater than the video decoding data time, calculating the audio injection limit time from the video decoding data time and the highest water level threshold; and when the audio injection data time is equal to the audio injection limit time, suspending the sending of audio data requests until the difference between the audio injection limit time and the audio decoding data time is smaller than the lowest water level threshold.

Description

Method for automatically repairing asynchronous audio and video and display equipment
Technical Field
The application relates to the technical field of audio and video synchronization, and in particular to a method for automatically repairing an audio/video out-of-sync state and a display device.
Background
In the related art, a user can watch audio and video on a display device, for example, films and TV series. Ideally, the audio and the video are completely synchronized. In the actual audio/video playing process of the display device, however, the audio output is linear while the video output may be nonlinear, so decoding and rendering the audio and the video take different amounts of time; each output frame may therefore drift by a slight gap, and as these gaps accumulate over time the audio/video desynchronization becomes more and more obvious.
Disclosure of Invention
The embodiments of the present application provide a method for automatically repairing an audio/video out-of-sync state, and a display device, which can improve the user experience.
In a first aspect, there is provided a display device including:
a display for displaying a user interface;
a user interface for receiving an input signal;
a controller coupled to the display and the user interface, respectively, for performing:
receiving an instruction for playing audio and video data, and sending an audio data request and a video data request;
receiving the fed-back audio data and video data, and determining the audio injection data time and the video injection data time;
decoding the audio data and the video data, and determining audio decoding data time and video decoding data time;
when the audio decoding data time is greater than the video decoding data time, calculating the audio injection limit time according to the video decoding data time and the highest water level threshold; and when the audio injection data time is equal to the audio injection limit time, suspending the sending of audio data requests until the difference between the audio injection limit time and the audio decoding data time is smaller than the lowest water level threshold.
In some embodiments, the controller is further configured to perform: when the audio decoding data time is not greater than the video decoding data time, the steps of transmitting the audio data request and the video data request are repeatedly performed.
In some embodiments, the controller is further configured to perform: when the audio injection data time is less than the audio injection limit time, the steps of transmitting the audio data request and the video data request are repeatedly performed.
In some embodiments, the audio injection limit time is calculated from the video decoding data time and the highest water level threshold according to the following formula:
audio injection limit time = audio decoding data time + (highest water level threshold - (audio decoding data time - video decoding data time)).
In some embodiments, the highest water level threshold is 2s and the lowest water level threshold is 0.5s.
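To make the control rule above concrete, the following is a minimal C++ sketch of the threshold logic; every identifier is a hypothetical name introduced here for illustration (not taken from the patent's implementation), and all times are media timestamps in seconds.

// Hypothetical names; times are media timestamps in seconds.
constexpr double kHighestWaterLevel = 2.0;  // highest water level threshold (2 s)
constexpr double kLowestWaterLevel  = 0.5;  // lowest water level threshold (0.5 s)

// Audio injection limit time, computed when the audio decoding data time
// exceeds the video decoding data time (the formula above).
double AudioInjectionLimit(double audioDecoded, double videoDecoded) {
    return audioDecoded + (kHighestWaterLevel - (audioDecoded - videoDecoded));
}

// Suspend audio data requests once audio injection reaches the limit ...
bool ShouldSuspendAudioRequests(double audioInjected, double limit) {
    return audioInjected >= limit;
}

// ... and resume once the limit leads the audio decoding data time by less
// than the lowest water level, i.e. the injected-but-undecoded audio has
// nearly drained.
bool ShouldResumeAudioRequests(double audioDecoded, double limit) {
    return limit - audioDecoded < kLowestWaterLevel;
}

Note that the limit formula algebraically reduces to videoDecoded + kHighestWaterLevel: audio may be injected at most one highest-water-level interval ahead of the video decoding progress.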
In a second aspect, a method for automatically repairing an audio/video asynchronous state is provided, including:
receiving an instruction for playing audio and video data, and sending an audio data request and a video data request; determining the time of audio injection data according to the fed-back audio data; determining the time of video injection data according to the fed-back video data;
decoding the audio data and the video data, and determining audio decoding data time and video decoding data time;
when the audio decoding data time is greater than the video decoding data time, calculating the audio injection limit time according to the video decoding data time and the highest water level threshold; and when the audio injection data time is equal to the audio injection limit time, suspending the sending of audio data requests, sending only the video data request, and repeatedly performing the step of determining the video injection data time according to the fed-back video data.
In some embodiments, the method further comprises: when the audio decoding data time is not greater than the video decoding data time, the steps of transmitting the audio data request and the video data request are repeatedly performed.
In some embodiments, the method further comprises: when the audio injection data time is less than the audio injection limit time, the steps of transmitting the audio data request and the video data request are repeatedly performed.
In some embodiments, the audio injection limit time is calculated from the video decoding data time and the highest water level threshold according to the following formula:
audio injection limit time = audio decoding data time + (highest water level threshold - (audio decoding data time - video decoding data time)).
In some embodiments, the highest water level threshold is 2s and the lowest water level threshold is 0.5s.
In the above embodiments, the method and display device for automatically repairing an audio/video out-of-sync state mitigate audio/video desynchronization and return the audio and video to a synchronized state. The method comprises: receiving an instruction to play audio and video data, and sending an audio data request and a video data request; determining the audio injection data time from the fed-back audio data; determining the video injection data time from the fed-back video data; decoding the audio data and the video data, and determining the audio decoding data time and the video decoding data time; when the audio decoding data time is greater than the video decoding data time, calculating the audio injection limit time from the video decoding data time and the highest water level threshold; and when the audio injection data time is equal to the audio injection limit time, suspending the sending of audio data requests until the difference between the audio injection limit time and the audio decoding data time is smaller than the lowest water level threshold.
Drawings
FIG. 1 illustrates a usage scenario of a display device according to some embodiments;
FIG. 2 illustrates a hardware configuration block diagram of the control apparatus 100 according to some embodiments;
FIG. 3 illustrates a hardware configuration block diagram of the display device 200 according to some embodiments;
FIG. 4 illustrates a software configuration diagram in the display device 200 according to some embodiments;
FIG. 5 schematically illustrates a flowchart of a method for automatically repairing an audio/video out-of-sync state according to some embodiments;
FIG. 6 schematically illustrates an audio and video timeline according to some embodiments.
Detailed Description
For the purposes of making the objects and embodiments of the present application more apparent, exemplary embodiments of the present application will be described in detail below with reference to the accompanying drawings in which they are illustrated. It should be apparent that the described exemplary embodiments are only some, not all, of the embodiments of the present application.
It should be noted that the brief description of the terminology in the present application is for the purpose of facilitating understanding of the embodiments described below only and is not intended to limit the embodiments of the present application. Unless otherwise indicated, these terms should be construed in their ordinary and customary meaning.
The terms "first," second, "" third and the like in the description and in the claims and in the above drawings are used for distinguishing between similar or similar objects or entities and not necessarily for describing a particular sequential or chronological order, unless otherwise indicated. It is to be understood that the terms so used are interchangeable under appropriate circumstances.
The terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to all elements explicitly listed, but may include other elements not expressly listed or inherent to such product or apparatus.
The term "module" refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware or/and software code that is capable of performing the function associated with that element.
Fig. 1 is a schematic diagram of an operation scenario between a display device and a control apparatus according to an embodiment. As shown in fig. 1, a user may operate the display device 200 through the smart device 300 or the control apparatus 100.
In some embodiments, the control apparatus 100 may be a remote controller, and the communication between the remote controller and the display device includes infrared protocol communication or bluetooth protocol communication, and other short-range communication modes, and the display device 200 is controlled by a wireless or wired mode. The user may control the display device 200 by inputting user instructions through keys on a remote control, voice input, control panel input, etc.
In some embodiments, a smart device 300 (e.g., mobile terminal, tablet, computer, notebook, etc.) may also be used to control the display device 200. For example, the display device 200 is controlled using an application running on a smart device.
In some embodiments, the display device 200 may also be controlled in ways other than through the control apparatus 100 and the smart device 300. For example, the user's voice commands may be received directly through a module for acquiring voice commands configured inside the display device 200, or through a voice control device configured outside the display device 200.
In some embodiments, the display device 200 is also in data communication with a server 400. The display device 200 may communicate via a local area network (LAN), a wireless local area network (WLAN), or other networks. The server 400 may provide various contents and interactions to the display device 200. The server 400 may be one cluster or multiple clusters, and may include one or more types of servers.
Fig. 2 exemplarily shows a block diagram of a configuration of the control apparatus 100 in accordance with an exemplary embodiment. As shown in fig. 2, the control device 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory, and a power supply. The control apparatus 100 may receive an input operation instruction of a user and convert the operation instruction into an instruction recognizable and responsive to the display device 200, and function as an interaction between the user and the display device 200.
Fig. 3 shows a hardware configuration block diagram of the display device 200 in accordance with an exemplary embodiment.
In some embodiments, display apparatus 200 includes at least one of a modem 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 260, an audio output interface 270, memory, a power supply, a user interface.
In some embodiments, the controller includes a processor, a video processor, an audio processor, a graphics processor, RAM, ROM, and first through nth input/output interfaces.
In some embodiments, the display 260 includes a display screen component for presenting pictures and a driving component for driving image display, and is used to receive image signals output by the controller and display video content, image content, and menu manipulation interfaces, as well as the UI interface manipulated by the user.
In some embodiments, the display 260 may be a liquid crystal display, an OLED display, or a projection device with a projection screen.
In some embodiments, the communicator 220 is a component for communicating with external devices or servers according to various communication protocol types. For example, the communicator may include at least one of a Wi-Fi module, a Bluetooth module, a wired Ethernet module, another network communication protocol chip or near-field communication protocol chip, and an infrared receiver. The display device 200 may establish transmission and reception of control signals and data signals with the external control apparatus 100 or the server 400 through the communicator 220.
In some embodiments, the user interface may be configured to receive control signals from the control device 100 (e.g., an infrared remote control, etc.).
In some embodiments, the detector 230 is used to collect signals of the external environment or interaction with the outside. For example, detector 230 includes a light receiver, a sensor for capturing the intensity of ambient light; alternatively, the detector 230 includes an image collector such as a camera, which may be used to collect external environmental scenes, user attributes, or user interaction gestures, or alternatively, the detector 230 includes a sound collector such as a microphone, or the like, which is used to receive external sounds.
In some embodiments, the external device interface 240 may include, but is not limited to, the following: high Definition Multimedia Interface (HDMI), analog or data high definition component input interface (component), composite video input interface (CVBS), USB input interface (USB), RGB port, or the like. The input/output interface may be a composite input/output interface formed by a plurality of interfaces.
In some embodiments, the modem 210 receives broadcast television signals by wired or wireless reception and demodulates audio/video signals, as well as additional data such as EPG data, from the broadcast television signals.
In some embodiments, the controller 250 and the modem 210 may be located in separate devices, i.e., the modem 210 may also be located in an external device to the main device in which the controller 250 is located, such as an external set-top box or the like.
In some embodiments, the controller 250 controls the operation of the display device and responds to user operations through various software control programs stored on the memory. The controller 250 controls the overall operation of the display apparatus 200. For example: in response to receiving a user command to select a UI object to be displayed on the display 260, the controller 250 may perform an operation related to the object selected by the user command.
In some embodiments, the object may be any one of selectable objects, such as a hyperlink, an icon, or other operable control. The operations related to the selected object are: displaying an operation of connecting to a hyperlink page, a document, an image, or the like, or executing an operation of a program corresponding to the icon.
In some embodiments, the controller includes at least one of a central processing unit (CPU), a video processor, an audio processor, a graphics processor (GPU), RAM (random access memory), ROM (read-only memory), first through nth input/output interfaces, and a communication bus (Bus).
The CPU processor executes operating system and application program instructions stored in the memory, and executes various applications, data, and content according to the various interactive instructions received from the outside, so as to finally display and play various audio and video content. The CPU processor may include a plurality of processors, for example one main processor and one or more sub-processors.
In some embodiments, a graphics processor is used to generate various graphical objects, such as icons, operation menus, and graphics displayed in response to user input instructions. The graphics processor includes an arithmetic unit, which receives the various interactive instructions input by the user, performs operations, and displays various objects according to their display attributes, and a renderer, which renders the objects produced by the arithmetic unit so that they can be displayed on the display.
In some embodiments, the video processor is configured to receive an external video signal and perform video processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, and image composition according to the standard codec protocol of the input signal, so as to obtain a signal that can be directly displayed or played on the display device 200.
In some embodiments, the video processor includes a demultiplexing module, a video decoding module, an image synthesis module, a frame rate conversion module, a display formatting module, and the like. The demultiplexing module demultiplexes the input audio/video data stream. The video decoding module processes the demultiplexed video signal, including decoding and scaling. The image synthesis module, such as an image synthesizer, superimposes the GUI signal input by the user or generated by the graphic generator onto the scaled video image, so as to generate an image signal for display. The frame rate conversion module converts the frame rate of the input video. The display formatting module changes the frame-rate-converted video output signal to conform to the display format, for example outputting RGB data signals.
In some embodiments, the audio processor is configured to receive an external audio signal, decompress and decode the audio signal according to a standard codec protocol of an input signal, and perform noise reduction, digital-to-analog conversion, and amplification processing to obtain a sound signal that can be played in a speaker.
In some embodiments, a user may input a user command through a Graphical User Interface (GUI) displayed on the display 260, and the user input interface receives the user input command through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface recognizes the sound or gesture through the sensor to receive the user input command.
In some embodiments, a "user interface" is a media interface for interaction and exchange of information between an application or operating system and a user that enables conversion between an internal form of information and a form acceptable to the user. A commonly used presentation form of the user interface is a graphical user interface (Graphic User Interface, GUI), which refers to a user interface related to computer operations that is displayed in a graphical manner. It may be an interface element such as an icon, a window, a control, etc. displayed in a display screen of the electronic device, where the control may include a visual interface element such as an icon, a button, a menu, a tab, a text box, a dialog box, a status bar, a navigation bar, a Widget, etc.
As shown in fig. 4, a system of the display device may include a kernel (Kernel), a command parser (shell), a file system, and application programs. The kernel, shell, and file system together form the basic operating system architecture that allows users to manage files, run programs, and use the system. After power-up, the kernel is started, the kernel space is activated, hardware is abstracted, hardware parameters are initialized, and virtual memory, the scheduler, signals, and inter-process communication (IPC) are operated and maintained. After the kernel is started, the shell and user application programs are loaded. An application program is compiled into machine code after being started, forming a process.
As shown in fig. 4, the system of the display device is divided into three layers, an application layer, a middleware layer, and a hardware layer, from top to bottom.
The application layer mainly comprises the common applications on the television and an application framework (Application Framework), where the common applications are mainly browser-based applications, such as HTML5 apps, and native applications (Native APPs);
The application framework (Application Framework) is a complete program model with the basic functions required by standard application software, such as file access, data exchange …, and the interfaces for using these functions (toolbar, status bar, menu, dialog box).
Native applications (Native APPs) may support online or offline use, message pushing, and local resource access.
The middleware layer includes middleware such as various television protocols, multimedia protocols, and system components. The middleware can use basic services (functions) provided by the system software to connect various parts of the application system or different applications on the network, so that the purposes of resource sharing and function sharing can be achieved.
The hardware layer mainly comprises the HAL interface, the hardware, and the drivers. The HAL interface is a unified interface against which all television chips dock, with the specific logic implemented by each chip. The drivers mainly include: audio driver, display driver, Bluetooth driver, camera driver, Wi-Fi driver, USB driver, HDMI driver, sensor drivers (e.g., fingerprint sensor, temperature sensor, pressure sensor), power-supply driver, and the like.
In the related art, a user can watch audio and video on a display device, for example, films and TV series. Ideally, the audio and the video are completely synchronized. In the actual audio/video playing process of the display device, however, the audio output is linear while the video output may be nonlinear, so decoding and rendering the audio and the video take different amounts of time; each output frame may therefore drift by a slight gap, and as these gaps accumulate over time the audio/video desynchronization becomes more and more obvious.
To solve the above technical problem, an embodiment of the present application provides a method for automatically repairing an audio/video out-of-sync state. As shown in fig. 5, the method includes:
s100, receiving an instruction for playing audio and video data, and sending an audio data request and a video data request. In the embodiment of the application, the instruction of playing the audio and video data can be completed by pressing the confirmation key on the control device by the user, and the selection and communication control is displayed on the display interface by way of example, the user moves the selector to the selection and communication control by the control device and presses the confirmation key on the control device, so that the instruction of playing the audio and video data is generated.
The display device may be equipped with a video application, for example the YouTube video application, and may play video through the YouTube video application. The YouTube video application needs a browser as its carrier; for example, the browser may be the Cobalt browser. In the embodiment of the application, the playing architecture in the display device system comprises a browser layer, a middle layer, and a player layer. The browser layer is the source of the audio and video data. The middle layer processes the audio and video data, including sending audio and video data requests to the browser layer, injecting data into the player layer, and so on. The player layer decrypts, decodes, synchronizes, and plays the audio and video data.
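As a hedged illustration of this division of labor, the following C++ interface sketch names the three layers; all type and method names are assumptions made here for illustration, not an actual browser or player API.

// Illustrative sketch of the three-layer playing architecture; the names are
// hypothetical, not taken from a real browser or player implementation.
struct MediaChunk {
    double timestamp;          // media time of this chunk, in seconds
    // encoded payload omitted
};

struct BrowserLayer {          // source of the audio and video data
    virtual MediaChunk OnAudioRequest() = 0;   // answers an audio data request
    virtual MediaChunk OnVideoRequest() = 0;   // answers a video data request
    virtual ~BrowserLayer() = default;
};

struct PlayerLayer {           // decrypts, decodes, synchronizes and plays
    virtual void InjectAudio(const MediaChunk& chunk) = 0;
    virtual void InjectVideo(const MediaChunk& chunk) = 0;
    virtual double AudioDecodedTime() const = 0;  // audio decoding data time
    virtual double VideoDecodedTime() const = 0;  // video decoding data time
    virtual ~PlayerLayer() = default;
};

// The middle layer sits between the two: it sends data requests upstream to
// the browser layer and injects the returned chunks downstream into the
// player layer, applying the water-level control described below.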
In the embodiment of the application, upon receiving the instruction to play audio and video data, the middle layer sends an audio data request and a video data request to the browser layer. Ideally, the audio and video remain synchronized, and the audio data request and the video data request can always be sent synchronously. In actual playback, however, decoding and rendering the video and the audio take different amounts of time, and eventually the audio and video play out of sync, affecting the user's viewing. Therefore, the embodiment of the application mitigates this desynchronization by controlling how the audio data request and the video data request are sent.
S200, receiving the fed-back audio data and video data, and determining the audio injection data time and the video injection data time. In the embodiment of the application, after sending the audio data request and the video data request, the middle layer receives the fed-back audio data and video data and determines the audio injection data time and the video injection data time.
The audio injection data time refers to the time corresponding to the received audio data. The audio data is sent from the browser layer to the middle layer, and the middle layer can parse out the time corresponding to the audio data. Illustratively, when the first episode of a series is being played and the audio data sent to the middle layer this time is the audio data at the 12 s mark of that episode, the audio injection data time is 12 s. Similarly, the video injection data time can be obtained by the middle layer parsing the video data.
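A minimal sketch of how such bookkeeping might look in the middle layer, assuming a hypothetical per-chunk callback (all names are invented for illustration):

// Hypothetical middle-layer bookkeeping: the injection data time is simply
// the media timestamp parsed from the most recently received chunk.
double g_audioInjectionTime = 0.0;  // seconds; e.g. 12.0 after the 12 s chunk
double g_videoInjectionTime = 0.0;  // seconds

void OnAudioDataReceived(double parsedTimestamp) {
    g_audioInjectionTime = parsedTimestamp;
}

void OnVideoDataReceived(double parsedTimestamp) {
    g_videoInjectionTime = parsedTimestamp;
}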
S300, decoding the audio data and the video data, and determining the audio decoding data time and the video decoding data time. In some embodiments, the audio data and the video data are decoded by the player layer; the timestamp corresponding to the currently decoded audio data is taken as the audio decoding data time, and the timestamp corresponding to the currently decoded video data is taken as the video decoding data time. It should be explained that, since the audio and video data sent by the browser layer to the middle layer need to be decoded piece by piece, the audio decoding data time and the audio injection data time are not the same. For example, the audio injection data time may be 12 s while the audio is not yet completely decoded, so the audio decoding data time may be 10 s. Similarly, the video decoding data time and the video injection data time differ.
In the embodiment of the application, the audio data and the video data are played after being decoded, so the audio decoding data time can be understood as the current audio playing progress, and the video decoding data time as the current video playing progress. In the embodiment of the application, the video output is nonlinear and the audio output is linear, so video decoding is slow and audio decoding is relatively fast; the video playing progress is therefore taken as the content shown on the progress bar displayed on the display.
S400, comparing the audio decoding data time with the video decoding data time.
S500, when the audio decoding data time is greater than the video decoding data time, calculating the audio injection limit time according to the video decoding data time and the highest water level threshold. In the embodiment of the application, when the audio decoding data time is greater than the video decoding data time, the audio and video are playing out of sync: audio decoding is fast while video decoding is slow, and without corresponding processing the desynchronization would worsen. The embodiment of the application therefore uses the audio injection limit time to control the sending of audio data requests.
To keep the audio/video desynchronization from worsening, the embodiment of the application limits the audio injection data time and reduces the difference between the audio injection data time and the video injection data time, thereby repairing the out-of-sync state.
In some embodiments, as shown in fig. 6, fig. 6 shows a timeline for the video and a timeline for the audio. In fig. 6, the video decoding data time is smaller than the audio decoding data time, so the audio and video are out of sync. Even accounting for the highest water level threshold, the audio injection data time would still run far ahead of the video injection data time if left unlimited, which would prevent the out-of-sync state from improving. The embodiment of the application therefore sets the audio injection limit time and caps the audio injection data time at it, so that the difference between the video injection data time and the audio injection data time is reduced and the out-of-sync state is improved.
The audio injection limit time is calculated from the video decoding data time and the highest water level threshold according to the following formula: audio injection limit time = audio decoding data time + (highest water level threshold - (audio decoding data time - video decoding data time)). Note that this simplifies to the video decoding data time plus the highest water level threshold. In the embodiment of the present application, the highest water level threshold is the maximum allowed time difference between the audio injection data time and the audio decoding data time, and likewise between the video injection data time and the video decoding data time. In some embodiments, the highest water level threshold may be 2 s and the lowest water level threshold may be 0.5 s. Illustratively, when the video decoding data time is 10 s, the audio injection limit time is 12 s.
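Written out in LaTeX as a worked check, using the document's own 10 s and 2 s example values (here t_{a,dec}, t_{v,dec}, and T_high are shorthand introduced for the audio decoding data time, the video decoding data time, and the highest water level threshold):

\begin{aligned}
t_{\mathrm{limit}} &= t_{a,\mathrm{dec}} + \bigl(T_{\mathrm{high}} - (t_{a,\mathrm{dec}} - t_{v,\mathrm{dec}})\bigr) \\
                   &= t_{v,\mathrm{dec}} + T_{\mathrm{high}} \\
                   &= 10\,\mathrm{s} + 2\,\mathrm{s} = 12\,\mathrm{s}.
\end{aligned}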
S500, comparing the audio injection data time with the audio injection limit time.
S600, when the audio injection data time is equal to the audio injection limit time, suspending the sending of audio data requests until the difference between the audio injection limit time and the audio decoding data time is smaller than the lowest water level threshold. In the embodiment of the application, when the audio injection data time reaches the audio injection limit time, no more audio data requests are sent; continuing to send them would only worsen the audio/video desynchronization. Therefore, the embodiment of the application sends only the video data request, so that the difference between the video injection data time and the audio injection data time keeps shrinking, and the player synchronizes the audio and video after decoding them, thereby repairing the out-of-sync state.
In some embodiments, the method further comprises: S700, when the audio decoding data time is not greater than the video decoding data time, repeating the steps of sending the audio data request and the video data request. In the embodiment of the application, when the audio decoding data time is not greater than the video decoding data time, the audio playback and the video playback are still out of sync, but video decoding is slow while audio decoding is fast, and the application takes the video decoding data time as the playback progress; the data requests therefore need not be limited, and after an appropriate time the audio playing progress catches up with the video playing progress by itself, so it suffices to repeat the steps of sending the audio data request and the video data request.
In some embodiments, the method further comprises: S800, when the audio injection data time is less than the audio injection limit time, repeating the steps of sending the audio data request and the video data request. In the embodiment of the application, an audio injection data time smaller than the audio injection limit time means that the current audio injection data time is acceptable, so the audio data request need not be limited.
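Putting S100-S800 together, the following is a self-contained, compilable C++ sketch of the whole control loop driven by a toy decoder model; all names, rates, and the simulation itself are illustrative assumptions, with only the 2 s and 0.5 s water levels and the limit formula taken from the document.

#include <algorithm>
#include <cstdio>

int main() {
    const double kHighest = 2.0, kLowest = 0.5;  // water level thresholds (s)
    const double kChunk = 0.1;                   // media seconds per data request
    double aInj = 0, vInj = 0;                   // injection data times
    double aDec = 0, vDec = 0;                   // decoding data times
    double limit = 0;                            // audio injection limit time
    bool suspended = false;                      // audio requests suspended?

    for (int tick = 0; tick < 400; ++tick) {
        // S100/S200: send data requests; while suspended (S600), only the
        // video data request is sent.
        if (!suspended) aInj += kChunk;
        vInj += kChunk;

        // S300: decoding progresses; video decodes slower than audio here to
        // mimic linear audio output vs. slower, nonlinear video output.
        aDec = std::min(aInj, aDec + 0.10);
        vDec = std::min(vInj, vDec + 0.08);

        if (!suspended) {
            if (aDec > vDec) {                   // S400: audio runs ahead
                // S500: audio injection limit time per the formula above.
                limit = aDec + (kHighest - (aDec - vDec));
                if (aInj >= limit)               // S600: pause audio requests
                    suspended = true;
                // else S800: aInj < limit, keep sending both requests.
            }
            // else S700: audio not ahead, keep sending both requests.
        } else if (limit - aDec < kLowest) {
            suspended = false;                   // resume below lowest level
        }
    }
    // With the throttle, the decode gap stays bounded near kHighest instead
    // of growing by 0.02 s per tick (8 s over this run) without it.
    std::printf("audio-video decode gap: %+.2f s\n", aDec - vDec);
    return 0;
}

In this toy model audio decoding keeps pace with injection, so each suspension lasts only until the next check; with a real decoder buffer, the suspension would persist until the injected-but-undecoded audio drains below the lowest water level, and the player layer performs the final audio/video synchronization after decoding, as described above.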
In the above embodiments, the method and display device for automatically repairing an audio/video out-of-sync state mitigate audio/video desynchronization and return the audio and video to a synchronized state. The method comprises: receiving an instruction to play audio and video data, and sending an audio data request and a video data request; determining the audio injection data time from the fed-back audio data; determining the video injection data time from the fed-back video data; decoding the audio data and the video data, and determining the audio decoding data time and the video decoding data time; when the audio decoding data time is greater than the video decoding data time, calculating the audio injection limit time from the video decoding data time and the highest water level threshold; and when the audio injection data time is equal to the audio injection limit time, suspending the sending of audio data requests until the difference between the audio injection limit time and the audio decoding data time is smaller than the lowest water level threshold.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present application, not to limit it. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and such modifications and substitutions do not depart from the spirit of the application.
The foregoing description, for purposes of explanation, has been presented in conjunction with specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the embodiments to the precise forms disclosed above. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles and the practical application, to thereby enable others skilled in the art to best utilize the embodiments and various embodiments with various modifications as are suited to the particular use contemplated.

Claims (8)

1. A display device, characterized by comprising:
a display for displaying a user interface;
a user interface for receiving an input signal;
a controller coupled to the display and the user interface, respectively, for performing:
receiving an instruction for playing audio and video data, and sending an audio data request and a video data request;
receiving the fed-back audio data and video data, and determining an audio injection data time and a video injection data time, wherein the audio injection data time refers to the time corresponding to the received audio data, and the video injection data time refers to the time corresponding to the received video data;
decoding the audio data and the video data, and determining an audio decoding data time and a video decoding data time, wherein the audio decoding data time refers to the timestamp corresponding to the decoded audio data, and the video decoding data time refers to the timestamp corresponding to the decoded video data;
when the audio decoding data time is greater than the video decoding data time, calculating an audio injection limit time according to the video decoding data time and a highest water level threshold, wherein the highest water level threshold is a preset highest time difference between the audio injection data time and the audio decoding data time, and the audio injection limit time is calculated by the following formula: audio injection limit time = audio decoding data time + (highest water level threshold - (audio decoding data time - video decoding data time));
when the audio injection data time is equal to the audio injection limit time, suspending the sending of audio data requests until the difference between the audio injection limit time and the audio decoding data time is smaller than a lowest water level threshold, wherein the lowest water level threshold is a preset minimum acceptable time difference between the audio injection data time and the audio decoding data time.
2. The display device of claim 1, wherein the controller is further configured to perform: when the audio decoding data time is not greater than the video decoding data time, the steps of transmitting the audio data request and the video data request are repeatedly performed.
3. The display device of claim 1, wherein the controller is further configured to perform: when the audio injection data time is less than the audio injection limit time, the steps of transmitting the audio data request and the video data request are repeatedly performed.
4. The display device of claim 1, wherein the highest water level threshold is 2s and the lowest water level threshold is 0.5s.
5. A method for automatically repairing an audio/video out-of-sync state, comprising:
receiving an instruction for playing audio and video data, and sending an audio data request and a video data request; determining an audio injection data time according to the fed-back audio data; and determining a video injection data time according to the fed-back video data, wherein the audio injection data time refers to the time corresponding to the received audio data, and the video injection data time refers to the time corresponding to the received video data;
decoding the audio data and the video data, and determining an audio decoding data time and a video decoding data time, wherein the audio decoding data time refers to the timestamp corresponding to the decoded audio data, and the video decoding data time refers to the timestamp corresponding to the decoded video data;
when the audio decoding data time is greater than the video decoding data time, calculating an audio injection limit time according to the video decoding data time and a highest water level threshold, wherein the highest water level threshold is a preset highest time difference between the audio injection data time and the audio decoding data time, and the audio injection limit time is calculated by the following formula: audio injection limit time = audio decoding data time + (highest water level threshold - (audio decoding data time - video decoding data time));
when the audio injection data time is equal to the audio injection limit time, suspending the sending of audio data requests until the difference between the audio injection limit time and the audio decoding data time is smaller than a lowest water level threshold, wherein the lowest water level threshold is a preset minimum acceptable time difference between the audio injection data time and the audio decoding data time.
6. The method of claim 5, wherein the method further comprises: when the audio decoding data time is not greater than the video decoding data time, the steps of transmitting the audio data request and the video data request are repeatedly performed.
7. The method of claim 5, wherein the method further comprises: when the audio injection data time is less than the audio injection limit time, the steps of transmitting the audio data request and the video data request are repeatedly performed.
8. The method of claim 5, wherein the highest water level threshold is 2s and the lowest water level threshold is 0.5s.
CN202110312267.4A 2021-03-24 2021-03-24 Method for automatically repairing asynchronous audio and video and display equipment Active CN113038193B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110312267.4A CN113038193B (en) 2021-03-24 2021-03-24 Method for automatically repairing asynchronous audio and video and display equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110312267.4A CN113038193B (en) 2021-03-24 2021-03-24 Method for automatically repairing asynchronous audio and video and display equipment

Publications (2)

Publication Number Publication Date
CN113038193A CN113038193A (en) 2021-06-25
CN113038193B true CN113038193B (en) 2023-08-11

Family

ID=76473130

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110312267.4A Active CN113038193B (en) 2021-03-24 2021-03-24 Method for automatically repairing asynchronous audio and video and display equipment

Country Status (1)

Country Link
CN (1) CN113038193B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102868939A (en) * 2012-09-10 2013-01-09 杭州电子科技大学 Method for synchronizing audio/video data in real-time video monitoring system
CN104902316A (en) * 2015-05-14 2015-09-09 广东欧珀移动通信有限公司 Method and device for synchronous playing of time, intelligent sound box, and mobile terminal
CN105979347A (en) * 2015-12-03 2016-09-28 乐视致新电子科技(天津)有限公司 Video play method and device
WO2020024950A1 (en) * 2018-08-01 2020-02-06 北京微播视界科技有限公司 Video recording method and device
WO2020155964A1 (en) * 2019-01-30 2020-08-06 上海哔哩哔哩科技有限公司 Audio/video switching method and apparatus, and computer device and readable storage medium
CN111601135A (en) * 2020-05-09 2020-08-28 青岛海信传媒网络技术有限公司 Method for synchronously injecting audio and video elementary streams and display equipment
CN112153447A (en) * 2020-09-27 2020-12-29 海信视像科技股份有限公司 Display device and sound and picture synchronous control method
CN112153446A (en) * 2020-09-27 2020-12-29 海信视像科技股份有限公司 Display equipment and streaming media video audio-video synchronization method
CN112533056A (en) * 2019-09-17 2021-03-19 海信视像科技股份有限公司 Display device and sound reproduction method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011066571A (en) * 2009-09-16 2011-03-31 Toshiba Corp Video-audio playback apparatus
US11288033B2 (en) * 2019-04-09 2022-03-29 Hisense Visual Technology Co., Ltd. Method for outputting audio data of applications and display device

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102868939A (en) * 2012-09-10 2013-01-09 杭州电子科技大学 Method for synchronizing audio/video data in real-time video monitoring system
CN104902316A (en) * 2015-05-14 2015-09-09 广东欧珀移动通信有限公司 Method and device for synchronous playing of time, intelligent sound box, and mobile terminal
CN105979347A (en) * 2015-12-03 2016-09-28 乐视致新电子科技(天津)有限公司 Video play method and device
WO2020024950A1 (en) * 2018-08-01 2020-02-06 北京微播视界科技有限公司 Video recording method and device
WO2020155964A1 (en) * 2019-01-30 2020-08-06 上海哔哩哔哩科技有限公司 Audio/video switching method and apparatus, and computer device and readable storage medium
CN112533056A (en) * 2019-09-17 2021-03-19 海信视像科技股份有限公司 Display device and sound reproduction method
CN111601135A (en) * 2020-05-09 2020-08-28 青岛海信传媒网络技术有限公司 Method for synchronously injecting audio and video elementary streams and display equipment
CN112153447A (en) * 2020-09-27 2020-12-29 海信视像科技股份有限公司 Display device and sound and picture synchronous control method
CN112153446A (en) * 2020-09-27 2020-12-29 海信视像科技股份有限公司 Display equipment and streaming media video audio-video synchronization method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Client-based Streaming Media Synchronization Control in Web-based Teaching; Gao Yali et al.; Journal of Hunan Industry Polytechnic; 2010-12-28 (No. 06); full text *

Also Published As

Publication number Publication date
CN113038193A (en) 2021-06-25

Similar Documents

Publication Publication Date Title
CN113630649B (en) Display equipment and video playing progress adjusting method
CN112672195A (en) Remote controller key setting method and display equipment
CN112653906B (en) Video hot spot playing method on display equipment and display equipment
CN112887778A (en) Switching method of video resource playing modes on display equipment and display equipment
CN113163258A (en) Channel switching method and display device
CN113490024A (en) Control device key setting method and display equipment
CN112799576A (en) Virtual mouse moving method and display device
CN113111214A (en) Display method and display equipment for playing records
CN112911371B (en) Dual-channel video resource playing method and display equipment
CN113014977B (en) Display device and volume display method
CN113038193B (en) Method for automatically repairing asynchronous audio and video and display equipment
CN113784203A (en) Display device and channel switching method
CN112732396A (en) Media asset data display method and display device
CN113490030A (en) Display device and channel information display method
CN113573112A (en) Display device and remote controller
CN112882780A (en) Setting page display method and display device
CN113064691A (en) Display method and display equipment for starting user interface
CN113490041B (en) Voice function switching method and display device
CN113784222B (en) Interaction method of application and digital television program and display equipment
CN113490013B (en) Server and data request method
CN112883302B (en) Method for displaying page corresponding to hyperlink address and display equipment
CN113676782B (en) Display equipment and interaction method for coexisting multiple applications
CN113766164B (en) Display equipment and signal source interface display method
CN113190202B (en) Data display method and display equipment
CN113038221B (en) Double-channel video playing method and display equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant