CN115643437A - Display device and video data processing method - Google Patents


Publication number
CN115643437A
Authority
CN
China
Prior art keywords: target; video data; data packet; frame; video frame
Legal status: Pending
Application number: CN202211124768.0A
Other languages: Chinese (zh)
Inventors: Xing Fang (邢芳), Li Bin (李斌)
Current Assignee: Hisense Visual Technology Co., Ltd.
Original Assignee: Hisense Visual Technology Co., Ltd.
Application filed by Hisense Visual Technology Co., Ltd.
Priority to CN202211124768.0A
Publication of CN115643437A

Abstract

The disclosure relates to a display device and a video data processing method, in the technical field of screen projection. The display device includes a controller configured to acquire a target video data packet to be parsed and parse the target video data packet; under the condition that the target video data packet comprises target identification information, a target parsing result is taken as the parsed data of the current video frame. The target identification information is used for indicating that the target video data packet is the last data packet of the current video frame; the target parsing result includes the parsing results of all video data packets included between the first video data packet of the current video frame and the target video data packet. Embodiments of the disclosure serve to reduce screen projection delay.

Description

Display device and video data processing method
Technical Field
The present disclosure relates to the field of display technologies, and in particular, to a display device and a video data processing method.
Background
In order to meet the requirements of multi-screen interaction and realize the screen sharing function, the screen of the mobile phone end is enlarged and displayed on a large-screen device, enhancing the user's daily experience of watching videos, game interaction, and the like; accordingly, wireless screen projection (same-screen) technology has matured. The Miracast wireless interconnection technology uses a wireless display standard based on wireless local area network (Wi-Fi) direct connection; in essence, the mobile phone end records and encodes its screen in real time and sends the resulting code stream to the television end.
When video stream data is parsed in the Miracast screen projection technology, a plurality of data packets are usually parsed into one frame of video data, and since data packets arrive continuously, it must be determined whether the parsing of the data packets corresponding to one frame of video data is complete. At present, the first data packet of a frame of video data carries a start code indicating that it is the packet where the frame begins; therefore, the parser learns that the data packets of the previous frame have all been parsed only after receiving the first data packet of the next frame and reading the start code it carries. With this way of determining whether a frame's data packets have been fully parsed, the previous frame of video data must be cached until the first data packet of the next frame is received. The cache time is therefore too long for the video data to be processed in time, and the screen projection delay becomes too large.
Disclosure of Invention
In order to solve the above technical problems, or at least partially solve them, the present disclosure provides a display device and a video data processing method that can reduce screen projection delay and improve users' video playing experience during screen projection.
In order to achieve the above purpose, the technical solutions provided by the embodiments of the present disclosure are as follows:
in a first aspect, there is provided a display device comprising:
a controller configured to: acquiring a target video data packet to be analyzed, and analyzing the target video data packet;
under the condition that the target video data packet comprises target identification information, taking a target analysis result as analysis data of the current video frame;
the target identification information is used for indicating that the target video data packet is the last data packet of the current video frame; the target parsing result includes parsing results of all video data packets included between the first video data packet of the current video frame and the target video data packet.
In a second aspect, a video data processing method is provided, including:
acquiring a target video data packet to be analyzed, and analyzing the target video data packet;
under the condition that the target video data packet comprises target identification information, taking a target analysis result as the analysis data of the current video frame;
the target identification information is used for indicating that the target video data packet is the last data packet of the current video frame; the target parsing result includes parsing results of all video data packets included between the first video data packet of the current video frame and the target video data packet.
In a third aspect, the present disclosure provides a computer-readable storage medium comprising: the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the video data processing method as shown in the second aspect.
In a fourth aspect, the present disclosure provides a computer program product comprising a computer program which, when run on a computer, causes the computer to implement the video data processing method as shown in the second aspect.
As can be seen from the foregoing technical solutions, with the display device and the video data processing method provided in the embodiments of the present disclosure, when an acquired video data packet (the target video data packet) is parsed, whether it is the last data packet of the current video frame can be determined by checking whether the parsed packet includes target identification information, which is used to indicate that a video data packet is the last data packet of the current video frame. When the parsed packet is determined to include the target identification information, all data packets of the current video frame have been parsed; at this time, the parsing results of all video data packets included between the first video data packet of the current video frame and the target video data packet can be determined as the parsed data of the current video frame. In the parsing process, once the last data packet of the current video frame is received, the target identification information is learned in time, so that the parsed data of the current video frame can be obtained in time for the subsequent decoding and rendering processes.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below; it is obvious that, for those skilled in the art, other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is an architectural diagram of an application scenario in some embodiments provided by embodiments of the present disclosure;
fig. 2 is a block diagram of a configuration of the control device 100 in some embodiments provided by the present disclosure;
fig. 3 is a block diagram of a hardware configuration of a display device 200 in some embodiments provided by embodiments of the present disclosure;
fig. 4 is a block diagram of a configuration of a terminal device 300 in some embodiments provided by the embodiments of the present disclosure;
fig. 5 is a schematic software configuration diagram of a display device 200 according to an embodiment of the present disclosure;
fig. 6 is a flowchart of a video data transmission method according to some embodiments of the present disclosure;
fig. 7 is a scene schematic diagram of a video data transmission method according to an embodiment of the disclosure;
fig. 8 is a schematic diagram illustrating a first video data packet of a video frame being determined according to an embodiment of the disclosure;
fig. 9 is a schematic diagram of another example of determining a first video data packet of a video frame according to the present disclosure;
fig. 10 is a second flowchart illustrating steps of a video data transmission method according to some embodiments of the present disclosure;
fig. 11 is a schematic diagram of a decoding manner in the video data transmission method provided in fig. 6, according to an embodiment of the present disclosure;
fig. 12 is a third flowchart illustrating steps of a video data transmission method according to some embodiments of the present disclosure;
fig. 13 is a schematic diagram of a video data processing method according to an embodiment of the present disclosure, based on the method shown in fig. 12;
fig. 14 is a flowchart of steps of a video data transmission method based on the method provided in fig. 12;
fig. 15 is another schematic view of a scenario based on the data transmission methods in fig. 10 and fig. 12 according to some embodiments of the present disclosure.
Detailed Description
In order that the above-mentioned objects, features and advantages of the present application may be more clearly understood, the solution of the present application will be further described below. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application; however, the present application may also be practiced otherwise than as described herein. It is to be understood that the embodiments described in this specification are only some, rather than all, of the embodiments of the present application.
It should be noted that the brief descriptions of the terms in the present application are only for convenience of understanding of the embodiments described below, and are not intended to limit the embodiments of the present application. These terms should be understood in their ordinary and customary meaning unless otherwise indicated.
The related art includes various wireless screen projection technologies, mainly AirPlay, the Digital Living Network Alliance (DLNA), and Miracast.
AirPlay is a wireless technology developed by Apple; data (for example, pictures, audio, and video) on a device running Apple's iOS operating system can be transmitted wirelessly over Wi-Fi to an AirPlay-supporting device. Since AirPlay can only be applied to devices developed by Apple, its application range is limited, and it is difficult to extend to other devices.
DLNA is a set of protocols, initiated by Sony, Intel, Microsoft, and others, for interconnection and interworking among personal computers, mobile devices, and consumer appliances. However, the screen projection picture of a terminal device using this protocol cannot be displayed on the display device in real time; that is, the protocol has no mirror screen projection function.
Based on these problems with AirPlay and DLNA, the related art can solve them through Miracast: the Miracast wireless interconnection technology has a mirror screen projection function, is supported by a wide range of devices, and achieves a good screen projection playing effect.
However, in the process of screen projection using Miracast, when video data packets are parsed, a plurality of data packets are usually parsed into one frame of video data, and since data packets arrive continuously, it must be determined whether the parsing of the data packets corresponding to one frame of video data is complete. At present, the first data packet of a frame carries a start code indicating that it is the packet where the frame begins, so the parser learns that the previous frame's data packets have all been parsed only after receiving the first data packet of the next frame and reading its start code. With this approach, the previous frame of video data must be cached until the first data packet of the next frame is received. The cache time is therefore too long for the video data to be processed in time, and the screen projection delay becomes too large.
In order to solve the problem of excessive screen projection delay, the embodiments of the present disclosure provide a display device and a video data processing method in which, when the last data packet of the current video frame is received, the target identification information (used to indicate that a video data packet is the last data packet of the current video frame) is learned in time, so that the parsed data of the current video frame can be obtained in time for the subsequent decoding and rendering processes.
Fig. 1 is a schematic diagram of an architecture of an application scenario in some embodiments provided by the embodiments of the present disclosure.
Illustratively, as shown in fig. 1, an architecture of an application scenario provided in an embodiment of the present application includes: control device 100, display device 200, terminal device 300, and server 400.
As shown in fig. 1, a user may operate the display device 200 through the terminal device 300 or the control apparatus 100 to control the display device 200 to perform a corresponding operation, and may transmit a video stream to the display device 200 through the terminal device 300, with the display device 200 playing video according to the transmitted video stream. The display device provided in the embodiments of the present application may take various forms, for example, a television, a smart speaker or refrigerator with a display function, a curtain with a display function, a Personal Computer (PC), a laser projection device, a display, an electronic whiteboard, a wearable device, an in-vehicle device, an electronic desktop, and the like.
In some embodiments, the control apparatus 100 may be a remote controller; communication between the remote controller and the display device 200 includes infrared protocol communication, bluetooth protocol communication, or other short-distance communication methods, and the remote controller controls the display device 200 in a wireless or wired manner. The user may input user instructions through keys on the remote controller, voice input, control panel input, etc., to control the display device 200.
In some embodiments, the terminal device 300 (e.g., mobile terminal, tablet, computer, notebook, etc.) may also be used to control the display device 200. For example, the display device 200 is controlled using an application program running on the smart device.
In some embodiments, screen projection between the terminal device 300 and the display device 200 controls the video stream through the Real Time Streaming Protocol (RTSP), and the stream itself is carried by the Real-time Transport Protocol (RTP) over the User Datagram Protocol (UDP), so that a user can project a screen to the display device 200 in real time through RTP.
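For orientation on the packet format used below: the RTP fixed header defined in RFC 3550 carries a marker (M) bit as the most significant bit of its second byte, which matches the "first bit of the 2nd byte" convention the embodiments below rely on. A minimal C++ sketch of reading these fields (illustrative names, assuming a bare 12-byte fixed header) follows:

```cpp
#include <cstdint>
#include <cstddef>

// Minimal view of the 12-byte RTP fixed header (RFC 3550).
// Byte 0: V(2) P(1) X(1) CC(4); byte 1: M(1) PT(7); then the
// sequence number, timestamp, and SSRC in network byte order.
struct RtpHeaderView {
    const uint8_t* data;
    size_t size;

    bool valid() const { return size >= 12 && (data[0] >> 6) == 2; }
    // Marker bit: the first (most significant) bit of the 2nd byte.
    bool marker() const { return (data[1] & 0x80) != 0; }
    uint8_t payloadType() const { return data[1] & 0x7F; }
    uint16_t sequenceNumber() const {
        return static_cast<uint16_t>((data[2] << 8) | data[3]);
    }
};
```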
In some embodiments, the terminal device 300 and the display device 200 may each install a matching software application, implementing screen projection through a network communication protocol and thereby achieving one-to-one real-time screen sharing. The audio and video content displayed on the terminal device 300 can thus be transmitted to the display device 200, realizing a synchronous display function. The display device 200 may be communicatively connected through a Local Area Network (LAN), a Wireless Local Area Network (WLAN), or other networks. The server 400 may provide various contents and interactions to the display device 200. The display device 200 may be a liquid crystal display, an OLED display, or a projection display device.
Fig. 2 is a block diagram of a configuration of the control device 100 in some embodiments provided by the embodiments of the present disclosure.
Illustratively, as shown in fig. 2, the control device 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory, and a power supply. The control apparatus 100 may receive an input operation instruction from a user and convert the operation instruction into an instruction recognizable and responsive by the display device 200, serving as an interaction intermediary between the user and the display device 200. The communication interface 130 is used for external communication and includes at least one of a Wi-Fi chip, a bluetooth module, NFC, or an alternative module. The user input/output interface 140 includes at least one of a microphone, a touch pad, a sensor, a key, or an alternative module.
Fig. 3 is a block diagram of a hardware configuration of a display device 200 in some embodiments provided by the embodiments of the present disclosure.
Illustratively, the display device 200 shown in fig. 3 includes: a tuner demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 260, an audio output interface 270, a memory, a power supply, and the like.
The controller 250 includes a central processing unit, a video processor, an audio processor, a graphic processor, a RAM, a ROM, a first interface to an nth interface for input/output, among others. The display 260 may be at least one of a liquid crystal display, an OLED display, a touch display, and a projection display, and may also be a projection device and a projection screen. The tuner demodulator 210 receives a broadcast television signal through wired or wireless reception and demodulates an audio/video signal, such as an EPG data signal, from a plurality of wireless or wired broadcast television signals. The detector 230 is used to collect signals of the external environment or interaction with the outside. The controller 250 and the tuner-demodulator 210 may be located in different separate devices, that is, the tuner-demodulator 210 may also be located in an external device of the main device where the controller 250 is located, such as an external set-top box.
In some embodiments, the display device is a terminal device with a display function, such as a television, a mobile phone, a computer, a learning machine, and the like.
In some embodiments, the controller 250 controls the operation of the display device and responds to user operations through various software control programs stored in memory. The controller 250 controls the overall operation of the display apparatus 200. The user may input a user command through a Graphical User Interface (GUI) displayed on the display 260, and the user input interface receives the user input command through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface receives the user input command by recognizing the sound or gesture through the sensor.
An output interface (the display 260 and/or the audio output interface 270) is configured to output user interaction information.
A communicator 220 for communicating with the server 400 or other devices.
In some embodiments, the controller 250 may obtain a target video data packet to be parsed and parse the target video data packet; under the condition that the target video data packet comprises target identification information, taking a target analysis result as analysis data of a current video frame; the target identification information is used for indicating that the target video data packet is the last data packet of the current video frame; the target parsing result includes parsing results of all video data packets included between the first video data packet of the current video frame and the target video data packet.
Wherein the controller 250 is specifically configured such that taking the target analysis result as the analysis data of the current video frame under the condition that the target video data packet includes the target identification information comprises: under the condition that the first bit of the 2nd byte in the header of the target video data packet is 1, determining that the header of the target video data packet includes the target identification information, and taking the target analysis result as the analysis data of the current video frame.
In some embodiments, after the controller 250 takes the target analysis result as the analysis data of the current video frame, the controller 250 decodes the analysis data of the current video frame after the analysis data of the previous video frame has been decoded, so as to obtain the decoded data corresponding to the current video frame.
Before the decoder decodes, its decoding mode is set to a one-frame-in, one-frame-out mode, so that the decoded data of each decoded video frame can be displayed on the display 260 of the display device 200 as soon as possible, reducing the time spent on screen projection.
In some embodiments, after acquiring the decoded data of the current video frame, the controller 250 acquires a first rendering time at which the decoded data corresponding to the previous video frame is rendered; determining a target rendering time corresponding to the current video frame according to the first rendering time and a preset rendering interval; and rendering the decoded data corresponding to the current video frame after the target rendering moment.
The controller 250 is further configured to: after the decoded data of the current video frame is obtained, acquire a first rendering time at which the decoded data of the previous video frame was rendered; determine a target rendering time corresponding to the current video frame according to the first rendering time and a preset rendering interval; before the target rendering time, acquire a first frame number corresponding to the cached decoded video frames; and if the first frame number is greater than or equal to a preset frame number, render the decoded data corresponding to the current video frame. Meanwhile, the preset rendering interval is set to be less than or equal to 2 times the standard frame interval, taking sound-picture synchronization into account.
In some embodiments, after the controller 250 determines the target rendering time corresponding to the current video frame according to the first rendering time and the preset rendering interval, if the first frame number of cached decoded video frames remains smaller than the preset frame number before the target rendering time, the controller 250 renders the decoded data corresponding to the current video frame after the target rendering time.
In some embodiments, the preset rendering interval is less than or equal to 2 times the standard frame interval.
Fig. 4 is a block diagram of a configuration of a terminal device 300 in some embodiments provided by the embodiments of the present disclosure.
Illustratively, the terminal device 300 shown in fig. 4 includes a controller 310, a communication interface 330, a user input/output interface 340, a memory, a power supply, and the like. The communication interface 330 is used for communicating with the outside and includes at least one of a Wi-Fi chip, a bluetooth module, NFC, or an alternative module. The user input/output interface 340 includes at least one of a microphone, a speaker, a display screen, a sensor, a camera, or an alternative module.
When the terminal device 300 projects a screen to the display device 200, a video stream may be transmitted to the display device 200 through the terminal device 300 so that the video picture corresponding to the video stream is displayed on the display device 200. Specifically, the terminal device 300 may transmit the video stream to the display device 200 by sending video data packets in real time.
The controller 310 in the terminal device 300 may control the communication interface 330 to send the target video data packet to the display device 200, so that the display device parses the target video data packet after receiving the target video data packet; and under the condition that the target video data packet comprises the target identification information, taking the target analysis result as analysis data of the current video frame.
The target identification information is used for indicating that the target video data packet is the last data packet of the current video frame; the target parsing result includes parsing results of all video data packets included between the first video data packet of the current video frame and the target video data packet.
Fig. 5 is a screen projection playing architecture diagram of the software configuration in the display device 200 according to an embodiment of the present disclosure. As shown in fig. 5, in some embodiments, the operating system of the display device 200 is mainly divided into five layers, which are, from top to bottom, an application layer, a Java interface layer, an implementation layer, a framework layer, and a hardware implementation layer. The implementation layer is a native service (Native) implementation layer.
The application layer at least comprises an application program, and the application programs can be a window program, a system setting program or a clock program of an operating system; or an application developed by a third party developer. In particular implementations, the application packages in the application layer are not limited to the above examples. In the embodiment of the present disclosure, a screen projection application is included in the application layer, and screen projection data sent by other devices may be received through the screen projection application, where the screen projection data may include a video data packet.
The Java interface layer provides program interfaces that can receive video data packets from the application layer and then begin screen projection preparation operations, such as creating a player, setting the screen projection interface, and setting the media data source, and that pass any error information to the upper application layer after performing these operations.
The implementation layer (i.e. the Native implementation layer) is a part of common services and link libraries and can be implemented in the C and C++ languages. The services of the Native implementation layer can communicate with the Java code of the upper layer and can also interact with the hardware drivers of the lower layer. The Native implementation layer may start video stream configuration after receiving video data packets from the upper Java interface layer, and may then process the received video data packets through the video stream processing module 501 and the synchronous rendering module 502. The video stream processing module 501 parses each video data packet transmitted by the upper Java interface layer and sends the resulting analysis result of a video frame to the decoder; after the decoder in the framework layer receives the analysis result of the video frame, it decodes it and sends the decoded data of the video frame to the synchronous rendering module 502, which renders the decoded data using the audio-video synchronization mechanism (Avsync) of the digital television.
The Framework layer, also called the application framework layer, mainly includes the media stream codec, which serves as the decoder for the analysis results of received video frames. The Framework layer receives the analysis results of video frames sent by the video stream processing module 501 in the Native implementation layer and, after decoding them, sends the decoded data of the video frames to the synchronous rendering module 502. In case of a video frame decoding error, information about the error may be sent to the Java interface layer, which forwards it to the application layer.
The hardware implementation layer comprises a plurality of hardware drivers, such as a display driver, a Bluetooth driver, an audio driver, a camera driver, a serial port driver and the like. The display driver in the embodiment of the disclosure is used for driving the display to display a screen projection picture.
For a more detailed description of the present solution, fig. 6 is a flowchart of a video data transmission method provided in some embodiments of the present disclosure. An actual implementation may involve more or fewer steps, and the order between the steps may also differ, as long as the video data processing method provided in the embodiments of the present disclosure can be implemented.
As shown in fig. 6, a video data processing method provided in the embodiment of the present disclosure includes the following steps:
s601, the display device obtains a target video data packet to be analyzed, which is sent by the terminal device.
The target video data packet is any one of the video data packets transmitted in real time. The terminal device sends the target video data packet to be parsed to the display device, and correspondingly, the display device receives the target video data packet to be parsed.
In the embodiment of the disclosure, before the target video data packet to be analyzed is sent to the display device by the terminal device, a screen projection channel between the terminal device and the display device may be established. Specifically, after the terminal device receives a screen projection instruction for a certain video resource, the terminal device may search for a display device in the same Wi-Fi connection environment as the terminal device, and establish a screen projection channel with the display device, so as to transmit a video data packet corresponding to the video resource through the screen projection channel.
Fig. 7 is a schematic scene diagram of a video data transmission method according to an embodiment of the present disclosure.
For example, as shown in fig. 7, the display device 71 is a television, the terminal device 72 is a mobile phone, and the control device 73 is a remote controller. The control device 73 may be used to turn the television's wireless screen projection function on or off. When the wireless screen projection function is turned on and the terminal device 72 receives a screen projection instruction for a certain video resource, the terminal device 72 may establish a screen projection channel with the display device 71 based on the same Wi-Fi connection environment and transmit the video data packets corresponding to the video resource to the display device 71 through the screen projection channel.
S602, the display device judges whether the target video data packet comprises target identification information.
The target identification information is used for indicating that the target video data packet is the last data packet of the current video frame.
Before the terminal device sends video data packets to the display device, each frame of video data corresponding to the video asset may be encapsulated into a plurality of video data packets; that is, the video data of one video frame is encapsulated into a plurality of video data packets. When the video data of each video frame is encapsulated, the target identification information may be configured in the last video data packet corresponding to the video frame and not configured in the other video data packets, so that the target identification information indicates that this video data packet is the last one of the video frame.
In some embodiments, the target identification information may be configured in the header portion of the video data packet or in the payload portion of the video data packet.
When the target identification information is configured in the header portion of the video data packet, it may be indicated by any one or more bits in the header.
Illustratively, the target identification information is configured in the header portion of the video data packet by setting the first bit of the 2nd byte in the header to 1; correspondingly, if the target identification information is not configured in the header portion, the first bit of the 2nd byte in the header is set to 0.
For example, assume that the terminal device encapsulates one frame of video data into video data packet a, video data packet b, and video data packet c, and transmits them to the display device in sequence. When encapsulating video data packet a, the terminal device may set the first bit of the 2nd byte in its header to 0; when encapsulating video data packet b, it may set the first bit of the 2nd byte in its header to 0; and when encapsulating video data packet c, it may set the first bit of the 2nd byte in its header to 1.
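A sender-side sketch of this encapsulation rule, assuming the flag occupies the first bit of the 2nd header byte as in the example above:

```cpp
#include <cstdint>

// Sender-side sketch: when packetizing one video frame, set the first
// bit of the 2nd header byte only on the frame's last packet (as for
// video data packet c above), and clear it on all earlier packets
// (as for video data packets a and b).
void setLastPacketFlag(uint8_t* header, bool isLastPacketOfFrame) {
    if (isLastPacketOfFrame) {
        header[1] |= 0x80;  // first bit of the 2nd byte = 1
    } else {
        header[1] &= 0x7F;  // first bit of the 2nd byte = 0
    }
}
```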
Further, when the display device receives video data packet a, it parses out that the first bit of the 2nd byte in the packet header is 0, learns that the packet does not include the target identification information, and determines that video data packet a is not the last video data packet of the current video frame. When the display device receives video data packet b, it likewise parses out that the first bit of the 2nd byte in the packet header is 0 and determines that video data packet b is not the last video data packet of the current video frame. When the display device receives video data packet c, it parses out that the first bit of the 2nd byte in the packet header is 1, learns that the packet includes the target identification information, and determines that video data packet c is the last video data packet of the current video frame; the analysis result of video data packet a, the analysis result of video data packet b, and the analysis result of video data packet c can then be combined to obtain the analysis result of the current video frame, that is, the target analysis result.
Under the condition that the target video data packet includes the target identification information, the target video data packet is the last video data packet of the current video frame, and at this time the following steps S603 and S604 are executed. Under the condition that the target video data packet does not include the target identification information, the target video data packet is not the last video data packet of the current video frame; at this time, the analysis result of the target video data packet needs to be cached and subsequently received video data packets continue to be parsed, that is, the following step S605 is executed.
Illustratively, if the first bit of the 2nd byte in the header of the target video data packet is 1, it can be determined that the target video data packet includes the target identification information, indicating that it is the last video data packet of the current video frame; if the first bit of the 2nd byte in the header is 0, it can be determined that the target video data packet does not include the target identification information, indicating that it is not the last video data packet of the current video frame.
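The receiving side of this check can be sketched as follows; the function and buffer names are illustrative, and the 12-byte header length assumes an RTP fixed header without a CSRC list or extension:

```cpp
#include <cstdint>
#include <cstddef>
#include <vector>

std::vector<uint8_t> frameBuffer;  // cached analysis results of the current frame

// Append one packet's payload to the frame under assembly; returns true
// when the target identification information is found, i.e. when the
// accumulated bytes form the complete analysis data of the current frame.
bool onVideoPacket(const uint8_t* pkt, size_t len) {
    constexpr size_t kHeaderLen = 12;  // assumes a bare RTP fixed header
    if (len <= kHeaderLen) return false;
    frameBuffer.insert(frameBuffer.end(), pkt + kHeaderLen, pkt + len);
    return (pkt[1] & 0x80) != 0;       // last data packet of the frame?
}
```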
S603, the display device obtains a target analysis result.
And S604, the display equipment takes the target analysis result as the analysis data of the current video frame.
The target analysis result refers to the analysis result of the current video frame and includes the analysis results of all video data packets included between the first video data packet of the current video frame and the target video data packet.
Illustratively, the video data of the current video frame is encapsulated into video data packet a, video data packet b, and video data packet c, which are transmitted to the display device in sequence. Video data packet a is the first data packet of the current video frame and video data packet c is the last data packet, so the analysis results of video data packet a, video data packet b, and video data packet c together constitute the analysis result of the current video frame, that is, the target analysis result.
The manner of determining the first video data packet of the current video frame may include, but is not limited to, the following two cases:
case 1: whether the video packet is the first packet of the current video frame is determined by whether a start code (start code) exists in the video packet.
The video data packet may be determined to be the first packet of the current video frame when a start code exists in the video data packet, and the video data packet may be determined not to be the first packet of the current video frame when the start code does not exist in the video data packet. Fig. 8 is a schematic diagram of determining a first video data packet of a video frame according to an embodiment of the present disclosure.
For example, as shown in fig. 8, assume a video frame is encapsulated into video data packet a, video data packet b, and video data packet c, which are received in sequence at the display device. Video data packet a is parsed first; if a start code is parsed from it, video data packet a is the first data packet of the video frame, and its analysis result 1 is cached. Video data packet b is parsed next; if neither the start code nor the target identification information is parsed from it, video data packet b is neither the first nor the last data packet of the video frame, and its analysis result 2 is cached. Video data packet c is then parsed; if no start code is parsed from it but the target identification information is, yielding analysis result 3, video data packet c is not the first data packet of the video frame but is its last data packet, and analysis result 1, analysis result 2, and analysis result 3 can be taken as the target analysis result corresponding to the video frame.
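Case 1 can be sketched as below, assuming the stream uses H.264/H.265 Annex-B framing, in which the start code is the byte pattern 0x000001 or 0x00000001 (the disclosure itself does not fix the codec):

```cpp
#include <cstdint>
#include <cstddef>

// Sketch: detect an Annex-B start code at the beginning of a packet's
// payload, marking the packet as the first data packet of a video frame.
bool hasStartCode(const uint8_t* payload, size_t len) {
    if (len >= 4 && payload[0] == 0 && payload[1] == 0 &&
        payload[2] == 0 && payload[3] == 1) {
        return true;  // 4-byte start code 0x00000001
    }
    return len >= 3 && payload[0] == 0 && payload[1] == 0 &&
           payload[2] == 1;  // 3-byte start code 0x000001
}
```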
Case 2: for a video data packet transmitted for the first time, the video data packet can be directly determined to be the first video data packet of a video frame; for a video data packet that is not transmitted for the first time, a video data packet transmitted after the last video data packet of the previous video frame may be determined as the first video data packet of the video frame.
Fig. 9 is a schematic diagram of another example of determining a first video data packet of a video frame according to an embodiment of the present disclosure.
Illustratively, as shown in fig. 9, assume that the display device receives video data packet d, video data packet a, video data packet b, and video data packet c in sequence. If, in the process of analyzing video data packet d, the target identification information is found in it, video data packet d is determined to be the last video data packet of the video frame it belongs to, so the next received packet, video data packet a, is determined to be the first video data packet of the following video frame, and its analysis result 1 is cached. Video data packet b is then analyzed and its analysis result 2 cached. Video data packet c is then analyzed; if the target identification information is parsed from video data packet c, yielding analysis result 3, video data packet c is the last data packet of the video frame, and analysis result 1, analysis result 2, and analysis result 3 can be taken as the target analysis result corresponding to the video frame.
S605, the display device analyzes the target video data packet and caches an analysis result of the target video data packet.
After the step S605 is executed, the process returns to the step S601, acquires the next video packet, and continues the video data processing flow shown in fig. 6.
Compared with the prior art, in which the completion of parsing the data packets of the previous frame of video data can be learned only after the first data packet of the next frame is received and the start code (start code) it carries is read, this approach reduces the time for which video data must be cached and thus reduces the screen projection delay.
Fig. 10 is a second flowchart of steps of a video data transmission method according to some embodiments of the present disclosure. Referring to fig. 10, another video data transmission method provided in the embodiments of the present disclosure includes:
s101, the display equipment acquires a target video data packet sent by the terminal equipment.
And S102, under the condition that the target video data packet comprises the target identification information, taking the target analysis result as the analysis data of the current video frame.
For the descriptions of S101 to S102, reference may be made to the descriptions of S601 to S604, which are not described herein again.
And S103, after the analysis data of the previous video frame has been decoded, decoding the analysis data of the current video frame to obtain the decoded data corresponding to the current video frame.
In some embodiments, before the decoder performs decoding, the decoding mode of the decoder may be set to a frame-in-frame-out mode in advance.
In the embodiment of the present disclosure, the decoding mode of the decoder may take various forms, including, but not limited to, the following two:
(1) A one-frame-in, one-frame-out mode.
The one-frame-in, one-frame-out mode means that one frame of video data is input into the decoder at a time, and the next frame of video data is input only after the decoded data of the current frame has been output, so that the analysis data of the current video frame input into the decoder can be decoded in time without waiting too long.
Here, one frame of video data refers to the analysis data of one video frame.
Fig. 11 is a schematic diagram of a decoding method provided in the embodiment of the present disclosure.
For example, as shown in fig. 11 (a), after the 1 st frame of video data is input to the decoder, the decoder decodes and outputs the decoded data of the first frame of video data, and after the decoder outputs the decoded data of the first frame of video data, the 2 nd frame of video data is input to the decoder for decoding.
(2) A multi-frame-in, one-frame-out mode.
The multi-frame-in, one-frame-out mode means inputting multiple frames of video data into the decoder at once and waiting for the decoder to decode them in sequence and output the corresponding decoded data.
Illustratively, as shown in fig. 11 (b), the 1 st frame video data, the 2 nd frame video data, the 3 rd frame video data, and the 4 th frame video data may be input to a decoder together, and the decoder may sequentially decode these data and output frame by frame.
In the above embodiment, when the decoding manner of the decoder is configured as one-frame-in, one-frame-out rather than multi-frame-in, one-frame-out, the problem that a large amount of buffered to-be-decoded data slows the decoder and makes the decoding time too long is avoided, and the data input into the decoder can be decoded in time without waiting too long.
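A minimal sketch of the one-frame-in, one-frame-out discipline follows; the Decoder interface is hypothetical and stands in for whatever codec the framework layer actually wraps:

```cpp
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical decoder interface, for illustration only.
struct Decoder {
    void submit(const std::vector<uint8_t>& parsedFrame);  // queue one frame
    bool takeDecoded(std::vector<uint8_t>* out);           // wait for output
};

// One frame in, one frame out: frame N+1 is submitted only after the
// decoded data of frame N has been taken out of the decoder.
void decodeLoop(Decoder& decoder,
                std::deque<std::vector<uint8_t>>& parsedFrames) {
    while (!parsedFrames.empty()) {
        decoder.submit(parsedFrames.front());
        parsedFrames.pop_front();
        std::vector<uint8_t> decoded;
        if (decoder.takeDecoded(&decoded)) {
            // hand `decoded` to the synchronous rendering module
        }
    }
}
```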
Fig. 12 is a flowchart illustrating a third step of a video data transmission method according to some embodiments of the disclosure. Referring to fig. 12, the video data transmission method includes the steps of:
s121, the display equipment acquires the target video data packet sent by the terminal equipment.
And S122, under the condition that the target video data packet comprises the target identification information, taking the target analysis result as the analysis data of the current video frame.
And S123, after the analysis data of the previous video frame has been decoded, decoding the analysis data of the current video frame to obtain the decoded data of the current video frame.
For the descriptions of S121 to S123, reference may be made to the descriptions of S601 to S605, which are not described herein again.
And S124, acquiring a first rendering moment when the decoded data corresponding to the previous video frame is rendered.
The first rendering time is the time at which the decoded data corresponding to the previous video frame was rendered. After the decoded data of the current video frame is acquired, this first rendering time is acquired.
And S125, determining a target rendering time corresponding to the current video frame according to the first rendering time and the preset rendering interval.
The preset rendering interval is the interval duration between the rendering time of the previous frame of video data and the rendering time of the current frame of video data.
The target rendering time is the time one preset rendering interval after the first rendering time.
Illustratively, as shown in fig. 13 provided by the embodiment of the present disclosure, if the first rendering time is T1, the preset rendering interval is ΔT, and the target rendering time is T2, then T2 = T1 + ΔT.
And S126, rendering the decoded data of the current video frame after the target rendering time.
In some embodiments, rendering the decoded data corresponding to the current video frame after the target rendering time may include, but is not limited to: acquiring the current system time, and performing video rendering of the decoded data corresponding to the current video frame when the current system time is after the target rendering time.
In some embodiments, when the current system time is before the target rendering time, the decoded data corresponding to the current video frame may be rendered after the target rendering time is reached.
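The pacing rule of S124 to S126 can be sketched directly against a monotonic clock; the render call itself is left as a placeholder:

```cpp
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Wait until the target rendering time T2 = T1 + dT, where T1 is the
// first rendering time (previous frame) and dT is the preset rendering
// interval, then render the current frame's decoded data.
void paceAndRender(Clock::time_point firstRenderTime,
                   std::chrono::milliseconds presetInterval) {
    const Clock::time_point targetTime = firstRenderTime + presetInterval;
    if (Clock::now() < targetTime) {
        std::this_thread::sleep_until(targetTime);  // system time before T2
    }
    // renderCurrentFrame();  // placeholder for the actual render call
}
```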
Fig. 14 is a flowchart illustrating steps of a video data transmission method according to an embodiment of the disclosure, based on the method provided in fig. 12.
In conjunction with fig. 12 described above, as shown in fig. 14, step S126 in fig. 12 described above may be replaced with steps S126a and S126b described below.
S126a, before the target rendering time, acquiring a first frame number corresponding to the cached decoded video frame.
After the decoded data of a video frame is decoded and put into the cache, the number of decoded video frames stored in the cache is checked to determine the first frame number.
For example, if the decoded data of one video frame is stored in the cache, the first frame number is 1.
And S126b, if the first frame number is greater than or equal to the preset frame number, rendering the decoded data of the current video frame.
In the embodiment of the present disclosure, the current system time may be acquired, and S126a and S126b above are executed while the current system time is before the target rendering time.
The preset frame number can be set based on the maximum number of frames the cache is allowed to store and is smaller than that maximum.
For example, assuming that the maximum number of frames the cache is allowed to store is 9, the preset frame number may be set to any positive integer less than 9.
In the above embodiment, because the preset frame number is smaller than the maximum number of frames the cache is allowed to store, the decoded data corresponding to a video frame can be rendered in time before the cache fills up, leaving free space in the cache for the decoded data of subsequently decoded video frames.
In some embodiments, the preset frame number may be set to 1, so that the decoded data corresponding to the current video frame is rendered whenever the first frame number is greater than or equal to 1. In this way, the decoded data of the cached video frame is rendered as soon as the decoded data of a newer video frame is stored in the cache, so the video is rendered promptly and the decoded data of each video frame is displayed in time.
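Steps S126a and S126b reduce to a simple gate; kPresetFrames stands for the preset frame number and is shown with the value 1 discussed above:

```cpp
#include <chrono>
#include <cstddef>

constexpr std::size_t kPresetFrames = 1;  // preset frame number (assumed)

// Before the target rendering time, render early once the number of
// buffered decoded frames reaches the preset frame number; at or after
// the target rendering time, render unconditionally.
bool shouldRenderNow(std::size_t bufferedDecodedFrames,
                     std::chrono::steady_clock::time_point now,
                     std::chrono::steady_clock::time_point targetTime) {
    if (now >= targetTime) return true;
    return bufferedDecodedFrames >= kPresetFrames;
}
```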
In some embodiments, the preset rendering interval may be less than or equal to m times the standard frame interval, where the value of m may be between 1 and 2.
Generally, when the time interval between the rendering times of the decoded data of two consecutive video frames (that is, the preset rendering interval) is less than or equal to 2 times the frame interval, synchronization between the projected picture and the audio data can be ensured; if the preset rendering interval is greater than 2 times the frame interval, audio and video may fall out of sync. That is, 2 times the frame interval is the maximum preset rendering interval under which sound-picture synchronization is still guaranteed.
Illustratively, in the embodiment of the present disclosure, the value of m may be set to 2.
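Worked numbers for this bound, assuming a 60 fps stream: the standard frame interval is 1000/60, about 16.7 ms, so with m = 2 the preset rendering interval must stay at or below roughly 33 ms to keep sound and picture synchronized:

```cpp
// At 60 fps, the standard frame interval and the resulting upper bound
// on the preset rendering interval (m = 2) work out as follows.
constexpr double kFps = 60.0;
constexpr double kFrameIntervalMs = 1000.0 / kFps;                     // ~16.7 ms
constexpr double kMaxPresetRenderIntervalMs = 2.0 * kFrameIntervalMs;  // ~33.3 ms
```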
Fig. 15 is a schematic view of another scene of a video data processing method according to some embodiments of the disclosure.
Referring to fig. 15, in the screen projection scene, the display device 151 is a television and the terminal device 152 is a PC end or a tablet. The PC or tablet transmits video data packets to the television in real time; after the television receives a video data packet, it may parse, decode, and render it according to the video data processing method provided in fig. 10 or fig. 12, and after rendering is completed, the television synchronously displays the screen projection picture of the PC or tablet.
The video data processing method provided in the embodiment of the disclosure achieves different effects of reducing screen projection delay for different screen projection devices.
Illustratively, when the screen projection device is a mobile phone end with a frame rate of 60 fps, the screen projection delay is 150 ms before using the video data processing method provided by the embodiments of the disclosure and is reduced to 80 ms after using it; when the screen projection device is a PC end with a frame rate of 60 fps, the screen projection delay is 240 ms before using the method and is reduced to 100 ms after using it.
In some embodiments, the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements each process of the video data processing method described above and can achieve the same technical effects, which are not repeated here to avoid repetition.
The computer readable storage medium may be ROM, RAM, magnetic or optical disk, etc.
In some embodiments, the present disclosure also provides a computer program product including a computer program which, when run on a computer, causes the computer to implement the video data processing method described above.
It should be noted that the terms "comprises," "comprising," or any other variation thereof in the description and claims of this disclosure and the above drawings are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing description has, for purposes of explanation, been given with reference to specific embodiments. However, the discussion above is not intended to be exhaustive or to limit the implementations to the precise forms disclosed. Many modifications and variations are possible in light of the above teachings. The embodiments were chosen and described in order to best explain the underlying principles and their practical application, thereby enabling others skilled in the art to make best use of the various embodiments, with such modifications as are suited to the particular use contemplated.

Claims (10)

1. A display device, comprising:
a controller configured to: acquire a target video data packet to be parsed, and parse the target video data packet;
and in a case that the target video data packet comprises target identification information, take a target parsing result as parsed data of a current video frame;
wherein the target identification information is used for indicating that the target video data packet is the last data packet of the current video frame, and the target parsing result comprises parsing results of all video data packets from the first video data packet of the current video frame to the target video data packet.
2. The display device of claim 1, wherein the controller is specifically configured to:
in a case that the first bit of the 2nd byte in the packet header of the target video data packet is 1, determine that the packet header of the target video data packet comprises the target identification information, and take the target parsing result as the parsed data of the current video frame.
3. The display device of claim 1, wherein the controller is further configured to:
after the target parsing result is taken as the parsed data of the current video frame, decode the parsed data of the current video frame after the parsed data corresponding to the previous video frame has been decoded, so as to obtain decoded data of the current video frame.
4. The display device according to claim 3, wherein the controller is further configured to:
before the decoder performs decoding, set the decoding mode of the decoder to a one-frame-in, one-frame-out mode.
5. The display device of claim 1, wherein the controller is further configured to:
after the decoded data of the current video frame is acquired, acquire a first rendering moment at which the decoded data of the previous video frame was rendered;
determine a target rendering moment of the current video frame according to the first rendering moment and a preset rendering interval;
and render the decoded data of the current video frame after the target rendering moment.
6. The display device of claim 1, wherein the controller is further configured to:
after the decoded data of the current video frame is acquired, acquire a first rendering moment at which the decoded data of the previous video frame was rendered;
determine a target rendering moment corresponding to the current video frame according to the first rendering moment and a preset rendering interval;
before the target rendering moment, acquire a first frame number corresponding to the cached decoded video frames;
and if the first frame number is greater than or equal to a preset frame number, render the decoded data corresponding to the current video frame.
7. The display device of claim 6, wherein the controller is further configured to:
after the target rendering moment corresponding to the current video frame is determined according to the first rendering moment and the preset rendering interval, if the first frame number remains smaller than the preset frame number up to the target rendering moment, render the decoded data corresponding to the current video frame after the target rendering moment.
8. The display device according to any of claims 5 to 7, wherein the preset rendering interval is less than or equal to 2 times a standard frame interval.
9. A method of processing video data, comprising:
acquiring a target video data packet to be parsed, and parsing the target video data packet;
and in a case that the target video data packet comprises target identification information, taking a target parsing result as parsed data of a current video frame;
wherein the target identification information is used for indicating that the target video data packet is the last data packet of the current video frame, and the target parsing result comprises parsing results of all video data packets from the first video data packet of the current video frame to the target video data packet.
10. The method according to claim 9, wherein the taking the target parsing result as the parsed data of the current video frame in the case that the target video data packet comprises the target identification information comprises:
determining, in a case that the first bit of the 2nd byte in the packet header of the target video data packet is 1, that the packet header of the target video data packet comprises the target identification information, and taking the target parsing result as the parsed data of the current video frame.
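By way of editorial illustration only (not part of the claims), the render scheduling recited in claims 5 to 7 may be sketched as follows; the structure, names, and units are assumptions:

// Editorial C++ sketch; RenderScheduler and its members are illustrative names.
#include <cstddef>

struct RenderScheduler {
    double firstRenderMs;     // first rendering moment: when the previous frame was rendered
    double presetIntervalMs;  // preset rendering interval, <= 2x the standard frame interval (claim 8)
    size_t presetFrameCount;  // preset frame number of cached decoded frames

    // Returns true if the current decoded frame should be rendered at time nowMs.
    bool shouldRenderNow(double nowMs, size_t cachedDecodedFrames) const {
        double targetMs = firstRenderMs + presetIntervalMs;  // target rendering moment
        if (nowMs >= targetMs) return true;                  // claim 7: render after the target moment
        return cachedDecodedFrames >= presetFrameCount;      // claim 6: render early when the cache backs up
    }
};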
CN202211124768.0A 2022-09-15 2022-09-15 Display device and video data processing method Pending CN115643437A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211124768.0A CN115643437A (en) 2022-09-15 2022-09-15 Display device and video data processing method


Publications (1)

Publication Number Publication Date
CN115643437A true CN115643437A (en) 2023-01-24

Family

ID=84942707

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211124768.0A Pending CN115643437A (en) 2022-09-15 2022-09-15 Display device and video data processing method

Country Status (1)

Country Link
CN (1) CN115643437A (en)

Similar Documents

Publication Publication Date Title
CN110740363B (en) Screen projection method and system and electronic equipment
KR101633100B1 (en) Information processing system, information processing apparatus, information processing method, and recording medium
US8996762B2 (en) Customized buffering at sink device in wireless display system based on application awareness
CN108874337B (en) Screen mirroring method and device
US20190184284A1 (en) Method of transmitting video frames from a video stream to a display and corresponding apparatus
WO2022089088A1 (en) Display device, mobile terminal, screen-casting data transmission method, and transmission system
TW201325230A (en) Minimal Cognitive mode for Wireless Display devices
JP2013141181A (en) Communication system using mobile terminal and television apparatus, mobile terminal, television apparatus, method for communication of mobile terminal, and operation program of mobile terminal
KR101730115B1 (en) Apparatus and method for processing image
JP2013141179A (en) Communication system using mobile terminal and television apparatus, mobile terminal, television apparatus, method for transmitting address of mobile terminal, and information processing program
JP5296229B2 (en) Communication system by portable terminal and television apparatus, portable terminal, television apparatus, communication method of portable terminal, operation program of portable terminal
WO2023231478A1 (en) Audio and video sharing method and device, and computer-readable storage medium
CN115643437A (en) Display device and video data processing method
CN115278323A (en) Display device, intelligent device and data processing method
CN115278324A (en) Display device, bluetooth device and BIS audio transmission method
US20190028522A1 (en) Transmission of subtitle data for wireless display
KR101671311B1 (en) Cloud stream service system, apparatus and cloud streaming service method thereof
JP5624643B2 (en) Electronic device, electronic device control method, electronic device control program, electronic device communication system
KR101678388B1 (en) Video processing method be considered latency, apparatus and cloud streaming service system therefor
CN114979736B (en) Display device and audio and video synchronization method
CN114827715A (en) Display device and media asset playing method
CN117915139A (en) Display equipment and sound and picture synchronization method
CN117651186A (en) Display device, video seamless switching method, and storage medium
CN117812341A (en) Display equipment and media asset playing method
CN117641024A (en) Display equipment and media data display method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination