CN117425023A - Video data processing method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN117425023A
Authority
CN
China
Prior art keywords
video data
data
format
module
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311235420.3A
Other languages
Chinese (zh)
Inventor
雷长鸣
崔斌斌
王艳辉
杨春晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Visionvera Information Technology Co Ltd
Original Assignee
Visionvera Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Visionvera Information Technology Co Ltd filed Critical Visionvera Information Technology Co Ltd
Priority to CN202311235420.3A
Publication of CN117425023A
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4

Abstract

An embodiment of the invention provides a video data processing method and apparatus, an electronic device, and a storage medium, relating to the technical field of video data processing. The method includes: in response to received video data, transmitting, by a detection module, the video data to an encoder if the video data meets a preset condition; encoding, by the encoder, the received video data to obtain encoded data corresponding to the video data; and sending the encoded data to a corresponding receiving end through a network port.

Description

Video data processing method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of video data processing technologies, and in particular, to a video data processing method, a video data processing apparatus, an electronic device, and a computer readable storage medium.
Background
With the development of network technology, video conferencing, video-based teaching and the like have become important parts of people's daily work and study. In these scenarios, video data must be processed, and the corresponding terminals often use an ARM (Advanced RISC Machines) chip to collect video data and perform encoding and decoding. An ARM chip, however, supports a color-channel quantization depth of only 8 bits and only the YUV420 pixel format, so a great deal of detail is lost relative to the original video. In addition, encoding and decoding are performed by software algorithms: the time from data input to completion of encoding is long, and decoding adds further time, so the overall processing of the video data is slow, which easily degrades the real-time performance of video data transmission.
Disclosure of Invention
Embodiments of the invention provide a video data processing method and apparatus, an electronic device, and a computer-readable storage medium, to solve, at least in part, the problem of high delay in video data transmission.
An embodiment of the invention discloses a video data processing method applied to a terminal device, the terminal device being configured with an FPGA device and a network port, the FPGA device comprising at least a detection module and an encoder communicatively connected to the detection module, the method comprising the following steps:
in response to received video data, transmitting, by the detection module, the video data to the encoder if the video data meets a preset condition;
encoding, by the encoder, the received video data to obtain encoded data corresponding to the video data;
and sending the encoded data to a corresponding receiving end through the network port.
Optionally, if the video data meets a preset condition, transmitting the video data to the encoder includes:
if the duration of the received video data reaches a preset duration threshold, transmitting the video data to the encoder.
Optionally, the detection module includes a plurality of data conversion units, and before the video data is transmitted to the encoder, the method further includes:
acquiring a current data format of the video data through the data conversion unit;
if the current data format is different from the target data format, converting the video data into the target data format through the data conversion unit;
the target data format at least comprises a playing format corresponding to the receiving end.
Optionally, the data format includes at least one of a color coding format and a resolution, the data conversion unit includes at least a color resampling module, a color space conversion module, and an image resolution conversion module, and if the current data format is different from a target data format, converting the video data into the target data format by the data conversion unit includes:
if the current color coding format of the video data is not the target color coding format, converting the video data into the target color coding format through the color resampling module and the color space conversion module;
and/or if the current resolution of the video data is not the target resolution, converting the video data into the target resolution through the image resolution conversion module.
Optionally, the target color coding format is the YUV422 coding format, and if the current color coding format of the video data is not the target color coding format, converting, by the color resampling module and the color space conversion module, the video data into the target color coding format includes:
If the video data is in the RGB coding format, converting the video data into the YUV coding format through the color space conversion module, and converting the converted video data into the YUV422 coding format through the color resampling module;
if the video data is in YUV444 encoding format or YUV420 encoding format, converting the video data into YUV422 encoding format by the color resampling module.
Optionally, the encoder at least includes an image reordering module, a video pixel encoding module, an entropy encoding module, and an encoded data buffering module, where the encoding, by the encoder, of the video data in response to the received video data, to obtain encoded data corresponding to the video data includes:
dividing the video data into a plurality of video sequences by the image reordering module, wherein the video sequences comprise a plurality of image groups;
converting, by the video pixel encoding module, the pixel data in each image group into image data in a compressed format, wherein the image data is identified by symbols;
and compressing, by the entropy encoding module, the image data according to the probability of occurrence of each symbol, to obtain encoded data corresponding to the video data.
Optionally, the FPGA device further includes an acquisition module communicatively connected to the detection module, and before the receiving of the video data by the detection module, the method further includes:
and receiving video data sent by the acquisition equipment through the acquisition module, and transmitting the video data to the detection module.
The embodiment of the invention also discloses a processing device of the video data, which is configured with an FPGA device and a network port, wherein the FPGA device at least comprises a detection module and an encoder which is in communication connection with the detection module; wherein,
the detection module is used for responding to received video data, and if the video data meets preset conditions, the video data is transmitted to the encoder;
the encoder is used for responding to the received video data, encoding the video data and obtaining encoded data corresponding to the video data;
the network port is configured to send the encoded data to a corresponding receiving end.
Optionally, the detection module is specifically configured to:
if the duration of the received video data reaches a preset duration threshold, transmitting the video data to the encoder.
Optionally, the detection module includes a plurality of data conversion units, and the detection module is specifically further configured to:
acquiring a current data format of the video data through the data conversion unit;
if the current data format is different from the target data format, converting the video data into the target data format through the data conversion unit;
the target data format at least comprises a playing format corresponding to the receiving end.
Optionally, the data format includes at least one of a color coding format and a resolution, the data conversion unit includes at least a color resampling module, a color space conversion module, and an image resolution conversion module, and the detection module is specifically further configured to:
if the current color coding format of the video data is not the target color coding format, converting the video data into the target color coding format through the color resampling module and the color space conversion module;
and/or if the current resolution of the video data is not the target resolution, converting the video data into the target resolution through the image resolution conversion module.
Optionally, the target color coding format is a YUV422 coding format, and the detection module is specifically further configured to:
if the video data is in the RGB coding format, converting the video data into the YUV coding format through the color space conversion module, and converting the converted video data into the YUV422 coding format through the color resampling module;
if the video data is in YUV444 encoding format or YUV420 encoding format, converting the video data into YUV422 encoding format by the color resampling module.
Optionally, the encoder at least includes an image reordering module, a video pixel encoding module, an entropy encoding module, and an encoded data buffering module, and the encoder is specifically configured to:
dividing the video data into a plurality of video sequences by the image reordering module, wherein the video sequences comprise a plurality of image groups;
converting, by the video pixel encoding module, the pixel data in each image group into image data in a compressed format, wherein the image data is identified by symbols;
and compressing, by the entropy encoding module, the image data according to the probability of occurrence of each symbol, to obtain encoded data corresponding to the video data.
Optionally, the FPGA device further includes an acquisition module communicatively connected to the detection module, and the apparatus further includes:
the acquisition module is used for receiving the video data sent by the acquisition device and transmitting the video data to the detection module.
The embodiment of the invention also discloses electronic equipment, which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the method according to the embodiment of the present invention when executing the program stored in the memory.
Embodiments of the present invention also disclose a computer-readable storage medium having instructions stored thereon, which when executed by one or more processors, cause the processors to perform the method according to the embodiments of the present invention.
The embodiment of the invention has the following advantages:
The terminal device includes at least a detection module and an encoder communicatively connected to the detection module. The terminal can monitor video data in real time through the detection module: when the terminal device receives video data, the detection module detects the received data and, when it meets the preset condition, sends it to the encoder. After receiving the video data, the encoder encodes it to obtain the corresponding encoded data, and the terminal device then sends the encoded data to the corresponding receiving end through the network port. In this way, based on the characteristics of the FPGA device, the supported data formats are effectively enriched during video data processing, and encoding starts as soon as the video data is detected to meet the condition, which reduces transmission delay.
Drawings
Fig. 1 is a flowchart of steps of a method for processing video data according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a part of a structure of a terminal device provided in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a communication architecture provided in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a communication architecture provided in an embodiment of the present invention;
fig. 5 is a block diagram showing a configuration of a processing apparatus for video data provided in an embodiment of the present invention;
fig. 6 is a block diagram of an electronic device provided in an embodiment of the invention.
Detailed Description
In order that the above objects, features and advantages of the present invention may be more readily understood, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
To help those skilled in the art better understand the technical solutions in the embodiments of the present invention, some technical terms involved are first explained:
An FPGA (Field Programmable Gate Array) is a semi-custom circuit in the field of application-specific integrated circuits (ASICs); it overcomes the inflexibility of fully custom circuits while also overcoming the limited gate count of earlier programmable devices.
H.265, also known as High Efficiency Video Coding (HEVC), is the latest in a series of video compression standards and belongs to the family of hybrid prediction-and-transform frame coders. As the successor to H.264, it builds on similar concepts but has gained rapid adoption with the spread of 4K content. H.265 can compress video at roughly half the bit rate of H.264 without affecting video quality, roughly doubling the theoretical coding efficiency; compressed to the same bit rate as H.264, H.265 delivers significantly better video quality.
The High Definition Multimedia Interface (HDMI) is a fully digital video and audio transmission interface that can carry uncompressed audio and video signals. HDMI is used in devices such as set-top boxes, DVD players, personal computers, televisions, game consoles, AV receivers, and digital audio equipment. Because HDMI carries the audio and video signals simultaneously over the same cable, it greatly simplifies system wiring and installation.
As an example, during the transmission of video data, if encoding takes too much time, the overall transmission duration becomes longer, the transmission delay increases, and real-time performance suffers. For a terminal device using an ARM chip, the supported color-channel quantization depth is only 8 bits and the encoding format supports only YUV420, so a great deal of detail is lost relative to the original video. Moreover, the video data is encoded by a software algorithm, so the time from data input to completion of encoding is long; the whole process takes about 5 video frames, and after the encoded data is transmitted over the network, decoded and displayed, the delay is even longer. Because of these problems, the requirements of real-time display scenarios such as video conferencing, emergency command, and control centers cannot be met.
In the present invention, the terminal device is configured with an FPGA device and a network port, the FPGA device comprising at least a detection module and an encoder communicatively connected to it. The terminal can monitor video data in real time through the detection module: when video data is received, the detection module detects it and sends it to the encoder once it meets the preset condition. After receiving the video data, the encoder encodes it to obtain the corresponding encoded data, and the terminal device then sends the encoded data to the corresponding receiving end through the network port. Based on the characteristics of the FPGA device, the supported data formats are effectively enriched during video data processing, and encoding starts as soon as the data is detected to meet the condition. Through the cooperation of the modules, a "receive while encoding" process is essentially realized, which improves encoding efficiency, reduces the delay of video data transmission, and guarantees real-time performance.
Referring to fig. 1, a flowchart of the steps of a video data processing method provided in an embodiment of the present invention is shown. The method is applied to a terminal device configured with an FPGA device and a network port, the FPGA device comprising at least a detection module and an encoder communicatively connected to the detection module, and may specifically include the following steps:
step 101, responding to received video data through the detection module, and if the video data meets a preset condition, transmitting the video data to the encoder;
optionally, the FPGA device has the characteristics of programmable and data parallel processing, and can perform parallel processing after serial-parallel conversion for high-speed serial interface (such as HDMI) data, based on which the problem of limited data quantization bit width when the ARM chip processes video data can be effectively solved. And, can support the 4K P60, the receiving of the YUV444 format, and support YUV422/10bit format encoding and decoding. Compared with ARM which only supports YUV420/8bit data format, the data bandwidth is increased by about 60%, and the video image quality can be ensured not to be affected.
In a possible implementation, referring to fig. 2, a schematic diagram of part of the structure of a terminal device provided in an embodiment of the present invention is shown. The terminal device may include an FPGA device, a network port, and a memory, the FPGA device comprising at least an acquisition module, a detection module, and an encoder. During video data processing, the terminal device may be communicatively connected to a corresponding acquisition device (such as a camera) through an HDMI interface and receive video data transmitted by the acquisition device over that interface. Specifically, the terminal device may receive the video data through the acquisition module in the FPGA device and perform the corresponding data format conversion; the converted video data is then transmitted to the detection module for integrity detection, and when the condition is detected to be satisfied, the video data is transmitted to the encoder to obtain the corresponding encoded data. The terminal device may then send the encoded data to the corresponding receiving end through the network port, so that the receiving end decodes the encoded data and displays the corresponding video content. Throughout this process, the terminal device may store the received video data, the encoded data generated after encoding, and so on in the memory.
In one example, the terminal device may be a video conference terminal. In a video conference, the terminal receives video images collected by a camera through its HDMI interface, processes and compression-encodes them, and transmits them to another video conference terminal through the network interface. In this process, the FPGA device is the core chip of the video conference terminal, responsible for video format reception, image processing, encoding, and related work.
In a specific implementation, after the acquisition device sends video data, the terminal device receives it through the acquisition module in the FPGA device and processes it using the FPGA's parallel processing mode. The processed video data is then transmitted to the detection module, which checks the integrity of the video data and decides whether to start video encoding.
In some possible implementations, the detection module may include a plurality of data conversion units. The terminal device may obtain the current data format of the video data through each data conversion unit and compare it with the corresponding target data format. If the current data format differs from the target data format, the data conversion unit converts the video data into the target data format; if it already matches, no format conversion is needed and the data is stored directly in the corresponding memory. Converting the format of the video data in this way can satisfy the requirements of different scenarios, while the supported data formats are enriched based on the characteristics of the FPGA. The target data format may include at least a playback format corresponding to the receiving end, a playback format preset by the terminal device, and the like.
The data format includes at least one of a color coding format and a resolution, and the data conversion unit includes at least a color resampling module, a color space conversion module, and an image resolution conversion module. Format conversion may involve comparing the color coding format, resolution, and so on of the video data with the corresponding target color coding format and target resolution, to judge whether the video data received by the terminal device meets the requirements. Specifically, if the current color coding format of the video data is not the target color coding format, the video data is converted into the target color coding format through the color resampling module and the color space conversion module; and/or, if the current resolution of the video data is not the target resolution, the video data is converted into the target resolution through the image resolution conversion module. Either the color coding format or the resolution alone may be converted, or both may be converted; the invention is not limited in this respect.
Note that the color coding formats may include the RGB format (Red, Green, Blue), the YUV format (luminance (Y) and chrominance (UV) of an image), and so on. Video data is generally handled in a YUV format, so RGB video data needs to be converted to YUV. The YUV family can be further divided into the YUV420, YUV422, and YUV444 formats; in the actual encoding process, converting the video data into the YUV422 format compresses the color image data more efficiently while maintaining good image quality.
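As a hedged illustration of the RGB-to-YUV conversion step (the patent does not specify the transform; the widely used full-range BT.601 coefficients are assumed here), a single-pixel conversion can be sketched as:

```python
def rgb_to_yuv_bt601(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YUV (BT.601 coefficients assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128  # chroma centered at 128
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128
    clamp = lambda x: max(0, min(255, int(round(x))))
    return clamp(y), clamp(u), clamp(v)
```

For a grayscale pixel the chroma components land at the neutral value 128, e.g. white (255, 255, 255) maps to (255, 128, 128).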
The color resampling module can convert the YUV444 format, the YUV420 format, and so on; for example, either of them can be converted into the YUV422 format. The color space conversion module is responsible for converting the RGB format into a YUV format. The image resolution conversion module converts the resolution of the input video data, so that inputs of different resolutions can be encoded at a uniform resolution; for example, input video data at 1080p may be converted for encoding at 720p, 2K, 4K, and so on, which the invention does not limit.
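A minimal sketch of the YUV444-to-YUV422 resampling idea, assuming simple horizontal 2:1 averaging of neighbouring chroma samples (one common approach; the patent does not specify the filter):

```python
def yuv444_to_yuv422(y, u, v):
    """Downsample chroma horizontally 2:1 by averaging adjacent samples.

    y, u, v are planes given as lists of rows (lists of ints); width must be even.
    Luma is untouched; each output chroma sample covers two pixels.
    """
    def down(plane):
        return [[(row[i] + row[i + 1]) // 2 for i in range(0, len(row), 2)]
                for row in plane]
    return y, down(u), down(v)
```

The luma plane keeps full resolution, so a 4-pixel row keeps 4 Y samples but only 2 U and 2 V samples, which is exactly the 4:2:2 sampling ratio.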
In one example, assume the target color coding format is YUV422. If the video data is in RGB format, it is converted into a YUV format by the color space conversion module, and the converted data is then converted into YUV422 by the color resampling module. If the video data is in YUV444 or YUV420 format, it is converted into YUV422 directly by the color resampling module. Thus, when RGB-format video data is detected, it is first converted into the corresponding YUV format and then into YUV422, so that the video data can be compressed well while maintaining good image quality.
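The branching in this example can be sketched as a small dispatcher (the function and the string module names are illustrative only; the patent realizes these steps as FPGA modules):

```python
def conversion_steps(fmt, res, target_fmt="YUV422", target_res=(1920, 1080)):
    """Return the ordered list of conversion modules a frame must pass through."""
    steps = []
    if fmt != target_fmt:
        if fmt == "RGB":
            # RGB is first converted to YUV by the color space conversion module
            steps.append("color_space_conversion")
        # YUV444/YUV420 (and converted RGB) are then resampled to YUV422
        steps.append("color_resampling")
    if res != target_res:
        steps.append("resolution_conversion")
    return steps
```

A frame already in the target format and resolution passes through with no conversion step, matching the "store directly without format conversion" path described above.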
In addition, the video data may be a data stream formed by a series of video frames. Upon receiving the video data transmitted by the acquisition module, the detection module immediately checks its integrity and judges whether the currently received data meets the encoding condition. If the duration of the received video data reaches a preset duration threshold, the detection module transmits the data to the encoder, so that, based on the FPGA's parallel processing, the video data is continuously encoded while it is still being received, improving encoding efficiency.
For example, the preset duration threshold may be one-eighth of a frame: whenever the detection module detects that one-eighth of a video frame has been received, it transmits that portion to the encoder, so that the encoder can encode synchronously with reception.
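The "encode while receiving" behaviour can be illustrated with a small generator (a software sketch of the idea only; the patent implements this in FPGA hardware, and the 1/8-frame chunk size follows the example above):

```python
def detect_and_forward(stream, frame_bytes, chunks_per_frame=8):
    """Forward received video data to the encoder in fixed-size chunks.

    As soon as a chunk (here 1/8 of a frame) has accumulated, it is yielded
    immediately, so encoding can start before the full frame has arrived.
    """
    chunk_size = frame_bytes // chunks_per_frame
    buf = bytearray()
    for data in stream:          # pieces of the incoming data stream
        buf.extend(data)
        while len(buf) >= chunk_size:
            yield bytes(buf[:chunk_size])
            del buf[:chunk_size]
    if buf:                      # flush any trailing partial chunk
        yield bytes(buf)
```

Because the generator yields inside the receive loop, a downstream encoder consuming it works on chunk N while chunk N+1 is still arriving, mirroring the pipelined FPGA behaviour described in the text.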
Step 102, in response to the received video data, encoding, by the encoder, the video data to obtain encoded data corresponding to the video data;
The encoder may include an image reordering module, a video pixel encoding module, an entropy encoding module, an encoded data buffering module, and so on. After receiving video data, the encoder may divide it into a plurality of video sequences through the image reordering module, each video sequence comprising a plurality of image groups. The video pixel encoding module then converts the pixel data in each image group into image data in a compressed format, the image data being identified by symbols, and the entropy encoding module compresses the image data according to the probability of occurrence of each symbol to obtain the encoded data corresponding to the video data. During encoding, as video data is input, these modules can operate in parallel in a pipelined data mode.
In some possible implementations, the modules work as follows.
Image reordering (Picture Reordering): in video compression, video frames are normally encoded and transmitted in temporal order; to maximize compression efficiency, however, the compression algorithm may reorder the frames to better suit the encoder. The image reordering module adjusts the order of video frames so that the encoder can better exploit inter-frame correlation and achieve higher compression.
Pixel encoding (Pixel Encoding): pixel encoding converts the pixel data of the original video frames into a compressed format. This step typically involves color space conversion, sample-rate reduction, and so on, to reduce the amount of data and improve compression efficiency. Common standards that employ pixel encoding include H.264/AVC and H.265/HEVC.
Entropy encoding (Entropy Encoding): after pixel encoding, entropy encoding further compresses the video data by allocating shorter codewords to pixel values that occur with higher probability; the most common entropy coding algorithms are Huffman coding and arithmetic coding. This step further reduces the bandwidth or storage space required for data transmission.
Encoded data buffering (bitstream buffer): the encoded data buffer stores the compressed code stream generated during video encoding while it awaits transmission or writing to a storage medium. Its size can affect the smoothness of video playback, so it must be set reasonably according to the requirements and performance of the system.
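As a hedged illustration of the entropy-encoding principle (shorter codewords for more probable symbols), a minimal construction of one of the two algorithms the text names, Huffman coding, might look like:

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a Huffman code table: frequent symbols get shorter codewords."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # heap entries: (count, tiebreaker, partial code table)
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)  # two least-frequent subtrees
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]
```

For the input "aaaabbc", the frequent symbol "a" receives a 1-bit code while "b" and "c" receive 2-bit codes, so the whole string compresses to 10 bits instead of a fixed-length encoding. Real encoders such as H.265 use context-adaptive variants of this idea.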
And step 103, transmitting the encoded data to a corresponding receiving end through the network port.
In the process of processing video data, a corresponding memory may be configured in the terminal device for storing the received video data and the encoded data. After the encoded data is obtained, it may be sent to the corresponding receiving end through the network port, so that the receiving end decodes the encoded data and displays the corresponding video content.
It should be noted that the embodiments of the present invention include, but are not limited to, the foregoing examples; it will be understood that those skilled in the art may also configure the embodiments according to actual requirements under the guidance of the concepts of the embodiments of the present invention, which are not limited thereto.
The terminal device comprises at least a detection module and an encoder communicatively connected to the detection module. The terminal can detect video data in real time through the detection module: when the terminal device receives video data, the detection module, in response to the received video data, detects it and sends it to the encoder when it meets the preset condition. After receiving the video data, the encoder encodes it to obtain the corresponding encoded data, and the terminal device can then send the encoded data to the corresponding receiving end through the network port. In this way, based on the characteristics of the FPGA device, the supported data formats can be effectively enriched during the processing of the video data, and the video data is encoded as soon as it is detected to meet the condition.
In order to better understand the technical solutions of the embodiments of the present invention, the following is an exemplary description by an example:
referring to fig. 3, a schematic diagram of a communication architecture provided in an embodiment of the present invention is shown, which relates to a process of video acquisition, detection and encoding, specifically:
stage1 is HDMI_IN, HDMI_IN is a video acquisition module, and can receive various data formats such as YUV, RGB and the like and data quantization width (from 8bit to 16 bit) based on a parallel processing mode of FPGA, and the resolution is supported to 4KP60 at maximum. While the video data is converted into YUV422/10bit format for encoding.
Stage2 is frame_detect, a video detection module for detecting the integrity of the input video. The traditional coding flow starts encoding only after the whole video frame has been buffered, so the overall processing flow has a long delay. By monitoring the input video with the video detection module, the encoding function can be started as soon as the video input meets the encoding data requirement (typically once one eighth of a frame has been input), and the subsequent video encoding module can start without waiting for the complete video frame to be stored in the DDR, thereby reducing the delay of the whole processing flow.
Stage3 is a video encoding device, which reads the YUV data stored in VIDEO IN and encodes the video according to a certain strategy, for example according to parameter settings: the encoded stream may contain I frames and P frames; or I frames, P frames, and B frames (key frames, forward predictive coded frames, and bi-directional predictive coded frames, respectively); or only I frames. The encoded data is stored in the DDR (memory) through the DATA BUS (the data transmission channel between components).
Stage4 is NET (the network port), which transmits the code stream output by the encoder to the network.
In addition, referring to fig. 4, a schematic diagram of a communication architecture provided in an embodiment of the present invention is shown. The HDMI_IN block diagram mainly includes the video processing sub-modules Resample (sample resampling), CSC (color space conversion), and Scaler (image resolution conversion). The Resample module mainly converts video formats such as YUV444 and YUV420, uniformly converting the input YUV444 and YUV420 formats into the YUV422 format. The CSC module is mainly responsible for converting the RGB format into the YUV format. The Scaler module converts the resolution of the input video, so that subsequent encoding can be performed at a unified resolution for different input resolutions. In the Encoding block diagram, the front-end processed video is encoded, and an H.265 encoding algorithm can be implemented with FPGA hardware. Specifically, the Encoding block diagram may include the following sub-modules: Picture Reordering (image reordering), Pixel Encoding (video pixel encoding), Entropy Encoding, and Bitstream Buffer (encoded data buffering). The corresponding processing procedure may be: image reordering first divides a video into a plurality of sequences, each sequence is divided into a plurality of image groups, and the FPGA can encode each group simultaneously. Entropy encoding refers to encoding without losing any information, according to the entropy principle. Quantization is a lossy compression method, whereas entropy coding records the mapping relationship of the original data in a more compact way and belongs to lossless compression.
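The Resample step (YUV444 to YUV422) halves the horizontal chroma resolution while keeping every luma sample. The sketch below averages adjacent chroma pairs; the averaging filter is an assumption, since the hardware could equally drop every second chroma sample.

```python
def yuv444_to_yuv422_row(y_row, u_row, v_row):
    """Halve horizontal chroma resolution for one scanline: keep all Y
    samples, average each horizontal pair of U and V samples (filter
    choice assumed; the patent does not specify it)."""
    assert len(y_row) == len(u_row) == len(v_row) and len(y_row) % 2 == 0
    u422 = [(u_row[i] + u_row[i + 1]) // 2 for i in range(0, len(u_row), 2)]
    v422 = [(v_row[i] + v_row[i + 1]) // 2 for i in range(0, len(v_row), 2)]
    return y_row, u422, v422

y, u, v = yuv444_to_yuv422_row([16] * 4, [100, 102, 200, 202], [50, 52, 60, 62])
```

A four-pixel row thus keeps four Y samples but only two U and two V samples, the 4:2:2 sampling pattern.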
The FPGA hardware allows each module to compute independently, and the plurality of modules operate in a pipeline data mode: after data passes through a processing module, the output is delayed for a period, but once a module starts to output, its output is not interrupted as long as its input is not interrupted. Together with the strategy of starting encoding as soon as video input is available, this realizes a low-delay encoding function.
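The pipeline data mode can be sketched with generators: each stage has an initial latency before its first output, but then streams one output per input without stalling. The stage names and delay values are invented for illustration.

```python
def stage(name, delay, source, log):
    """A pipeline stage: emits its first item 'delay' steps after its
    first input, then one item per input with no further stalls."""
    buffered = []
    for item in source:
        buffered.append(f"{name}({item})")
        if len(buffered) > delay:      # initial latency, then streaming
            out = buffered.pop(0)
            log.append(out)
            yield out
    for out in buffered:               # drain once input ends
        log.append(out)
        yield out

log = []
frames = iter(range(6))                # incoming video data
csc = stage("csc", 1, frames, log)     # color-space conversion stage
enc = stage("enc", 2, csc, log)        # encoding stage, fed by csc
results = list(enc)
```

Every input frame eventually emerges from the final stage in order; only the startup latency is added, matching the "output is not interrupted once it starts" behavior described above.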
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
Referring to fig. 5, there is shown a block diagram of a video data processing apparatus provided in an embodiment of the present invention, where the apparatus is configured with an FPGA device and a network port, and the FPGA device includes at least a detection module and an encoder communicatively connected to the detection module; wherein,
the detection module is used for responding to received video data, and if the video data meets preset conditions, the video data is transmitted to the encoder;
the encoder is used for responding to the received video data, encoding the video data and obtaining encoded data corresponding to the video data;
The network port is configured to send the encoded data to a corresponding receiving end.
In some possible implementations, the detection module is specifically configured to:
and if the time length of the received video data reaches the preset time length threshold value, transmitting the video data to the encoder.
In some possible implementations, the detection module includes a plurality of data conversion units, and the detection module is specifically further configured to:
acquiring a current data format of the video data through the data conversion unit;
if the current data format is different from the target data format, converting the video data into the target data format through the data conversion unit;
the target data format at least comprises a playing format corresponding to the receiving end.
In some possible implementations, the data format includes at least one of a color coding format and a resolution, the data conversion unit includes at least a color resampling module, a color space conversion module, and an image resolution conversion module, and the detection module is specifically further configured to:
if the current color coding format of the video data is not the target color coding format, converting the video data into the target color coding format through the color resampling module and the color space conversion unit;
And/or if the current resolution of the video data is not the target resolution, converting the video data into the target resolution through the image resolution conversion module.
In some possible implementations, the target color coding format is a YUV422 coding format, and the detection module is specifically further configured to:
if the video data is in the RGB coding format, converting the video data into the YUV coding format through the color space conversion module, and converting the converted video data into the YUV422 coding format through the color resampling module;
if the video data is in YUV444 encoding format or YUV420 encoding format, converting the video data into YUV422 encoding format by the color resampling module.
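The two conversion branches above (RGB goes through CSC and then resampling; YUV444 or YUV420 goes through resampling only) can be sketched as a dispatch function. The function returns the list of modules a frame passes through; this routing-only view is a simplification, since the real hardware transforms the pixel data itself.

```python
def to_yuv422(frame_format):
    """Route an input frame toward the YUV422 target format, mirroring
    the branches in the text. Returns the modules traversed."""
    if frame_format == "RGB":
        # CSC first (RGB -> YUV), then halve chroma resolution
        return ["csc", "resample"]
    if frame_format in ("YUV444", "YUV420"):
        return ["resample"]
    if frame_format == "YUV422":
        return []            # already in the target format
    raise ValueError(f"unsupported input format: {frame_format}")
```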
In some possible implementations, the encoder includes at least an image reordering module, a video pixel encoding module, an entropy encoding module, and an encoded data buffering module, the encoder being specifically configured to:
dividing the video data into a plurality of video sequences by the image reordering module, wherein the video sequences comprise a plurality of image groups;
converting pixel data in each image group into compressed format image data by the video pixel coding module, wherein the image data is marked by a symbol;
And compressing the image data according to the probability of the symbol occurrence by the entropy coding module to obtain coding data corresponding to the video data.
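The image reordering step, dividing the video into sequences that each contain several image groups, can be sketched as below. Both the GOP size and the number of GOPs per sequence are assumptions, since the text leaves them to configuration; each resulting group could then be handed to a parallel encoding unit on the FPGA.

```python
def split_into_gops(frames, gop_size=8, gops_per_sequence=4):
    """Divide a frame list into sequences of image groups (GOPs).
    Sizes are illustrative; the patent leaves both to configuration."""
    gops = [frames[i:i + gop_size] for i in range(0, len(frames), gop_size)]
    return [gops[i:i + gops_per_sequence]
            for i in range(0, len(gops), gops_per_sequence)]

sequences = split_into_gops(list(range(64)))  # 64 frame indices
```

Sixty-four frames become eight GOPs of eight frames, grouped into two sequences of four GOPs each.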
In some possible implementations, the FPGA device further includes an acquisition module communicatively coupled to the detection module, the apparatus further including:
the acquisition module is used for receiving video data sent by the acquisition equipment and transmitting the video data to the detection module.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
In addition, the embodiment of the invention also provides electronic equipment, comprising a processor, a memory, and a computer program stored on the memory and executable on the processor. When executed by the processor, the computer program implements each process of the above video data processing method embodiments and can achieve the same technical effect; to avoid repetition, details are not repeated here.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored. When executed by a processor, the computer program implements each process of the video data processing method embodiments and can achieve the same technical effect; to avoid repetition, details are not repeated here. The computer readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Fig. 6 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present invention.
The electronic device 600 includes, but is not limited to: a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, and a power supply 611. It will be appreciated by those skilled in the art that the electronic device is not limited to the structure shown; it may include more or fewer components than illustrated, combine some components, or arrange the components differently. In the embodiments of the invention, the electronic equipment includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer, and the like.
It should be understood that, in the embodiment of the present invention, the radio frequency unit 601 may be used to receive and send information or signals during a call, specifically, receive downlink data from a base station, and then process the downlink data with the processor 610; and, the uplink data is transmitted to the base station. Typically, the radio frequency unit 601 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 601 may also communicate with networks and other devices through a wireless communication system.
The electronic device provides wireless broadband internet access to the user via the network module 602, such as helping the user to send and receive e-mail, browse web pages, and access streaming media, etc.
The audio output unit 603 may convert audio data received by the radio frequency unit 601 or the network module 602 or stored in the memory 609 into an audio signal and output as sound. Also, the audio output unit 603 may also provide audio output (e.g., a call signal reception sound, a message reception sound, etc.) related to a specific function performed by the electronic device 600. The audio output unit 603 includes a speaker, a buzzer, a receiver, and the like.
The input unit 604 is used for receiving audio or video signals. The input unit 604 may include a graphics processor (Graphics Processing Unit, GPU) 6041 and a microphone 6042, the graphics processor 6041 processing image data of still pictures or video obtained by an image capturing apparatus (such as a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 606. The image frames processed by the graphics processor 6041 may be stored in the memory 609 (or other storage medium) or transmitted via the radio frequency unit 601 or the network module 602. The microphone 6042 may receive sound and process it into audio data. In the case of a telephone call mode, the processed audio data may be converted into a format that can be transmitted to the mobile communication base station via the radio frequency unit 601 for output.
The electronic device 600 also includes at least one sensor 605, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor and a proximity sensor, wherein the ambient light sensor can adjust the brightness of the display panel 6061 according to the brightness of ambient light, and the proximity sensor can turn off the display panel 6061 and/or the backlight when the electronic device 600 moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the acceleration in all directions (generally three axes), and can detect the gravity and direction when stationary, and can be used for recognizing the gesture of the electronic equipment (such as horizontal and vertical screen switching, related games, magnetometer gesture calibration), vibration recognition related functions (such as pedometer and knocking), and the like; the sensor 605 may also include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, etc., which are not described herein.
The display unit 606 is used to display information input by a user or information provided to the user. The display unit 606 may include a display panel 6061, and the display panel 6061 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an Organic Light-Emitting Diode (OLED), or the like.
The user input unit 607 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic device. Specifically, the user input unit 607 includes a touch panel 6071 and other input devices 6072. Touch panel 6071, also referred to as a touch screen, may collect touch operations thereon or thereabout by a user (e.g., operations of the user on touch panel 6071 or thereabout using any suitable object or accessory such as a finger, stylus, or the like). The touch panel 6071 may include two parts of a touch detection device and a touch controller. The touch detection device detects the touch azimuth of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device and converts it into touch point coordinates, which are then sent to the processor 610, and receives and executes commands sent from the processor 610. In addition, the touch panel 6071 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. The user input unit 607 may include other input devices 6072 in addition to the touch panel 6071. Specifically, other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a track ball, a mouse, and a joystick, which are not described herein.
Further, the touch panel 6071 may be overlaid on the display panel 6061, and when the touch panel 6071 detects a touch operation thereon or thereabout, the touch operation is transmitted to the processor 610 to determine a type of a touch event, and then the processor 610 provides a corresponding visual output on the display panel 6061 according to the type of the touch event. It will be appreciated that in one embodiment, the touch panel 6071 and the display panel 6061 are two independent components for implementing the input and output functions of the electronic device, but in some embodiments, the touch panel 6071 and the display panel 6061 may be integrated to implement the input and output functions of the electronic device, which is not limited herein.
The interface unit 608 is an interface to which an external device is connected to the electronic apparatus 600. For example, the external devices may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 608 may be used to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements within the electronic apparatus 600 or may be used to transmit data between the electronic apparatus 600 and an external device.
The memory 609 may be used to store software programs as well as various data. The memory 609 may mainly include a storage program area that may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and a storage data area; the storage data area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, the memory 609 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The processor 610 is a control center of the electronic device, connects various parts of the entire electronic device using various interfaces and lines, and performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 609, and calling data stored in the memory 609, thereby performing overall monitoring of the electronic device. The processor 610 may include one or more processing units; preferably, the processor 610 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 610.
The electronic device 600 may also include a power supply 611 (e.g., a battery) for powering the various components, and preferably the power supply 611 may be logically coupled to the processor 610 via a power management system that performs functions such as managing charging, discharging, and power consumption.
In addition, the electronic device 600 includes some functional modules, which are not shown, and will not be described herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present invention and the scope of the claims, which are to be protected by the present invention.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk, etc.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (10)

1. A method for processing video data, which is applied to a terminal device, wherein the terminal device is configured with an FPGA device and a network port, the FPGA device at least comprises a detection module and an encoder communicatively connected with the detection module, and the method comprises:
responding to the received video data through the detection module, and if the video data meet the preset condition, transmitting the video data to the encoder;
encoding, by the encoder, the video data in response to the received video data, obtaining encoded data corresponding to the video data;
and transmitting the coded data to a corresponding receiving end through the network port.
2. The method of claim 1, wherein transmitting the video data to the encoder if the video data satisfies a preset condition comprises:
and if the time length of the received video data reaches the preset time length threshold value, transmitting the video data to the encoder.
3. The method according to claim 1 or 2, wherein the detection module comprises a number of data conversion units, the method further comprising, prior to transmitting video data to the encoder:
Acquiring a current data format of the video data through the data conversion unit;
if the current data format is different from the target data format, converting the video data into the target data format through the data conversion unit;
the target data format at least comprises a playing format corresponding to the receiving end.
4. The method of claim 3, wherein the data format comprises at least one of a color coding format and a resolution, the data conversion unit comprises at least a color resampling module, a color space conversion module, and an image resolution conversion module, and converting the video data into the target data format by the data conversion unit if the current data format is different from the target data format comprises:
if the current color coding format of the video data is not the target color coding format, converting the video data into the target color coding format through the color resampling module and the color space conversion unit;
and/or if the current resolution of the video data is not the target resolution, converting the video data into the target resolution through the image resolution conversion module.
5. The method according to claim 4, wherein the target color-coded format is a YUV422 coded format, and wherein if the current color-coded format of the video data is not the target color-coded format, converting the video data to the target color-coded format by the color resampling module and the color space conversion unit comprises:
if the video data is in the RGB coding format, converting the video data into the YUV coding format through the color space conversion module, and converting the converted video data into the YUV422 coding format through the color resampling module;
if the video data is in YUV444 encoding format or YUV420 encoding format, converting the video data into YUV422 encoding format by the color resampling module.
6. The method of claim 1, wherein the encoder comprises at least an image reordering module, a video pixel encoding module, an entropy encoding module, and an encoded data buffering module, wherein the encoding of the video data by the encoder in response to the received video data to obtain encoded data corresponding to the video data comprises:
Dividing the video data into a plurality of video sequences by the image reordering module, wherein the video sequences comprise a plurality of image groups;
converting pixel data in each image group into compressed format image data by the video pixel coding module, wherein the image data is marked by a symbol;
and compressing the image data according to the probability of the symbol occurrence by the entropy coding module to obtain coding data corresponding to the video data.
7. The method of claim 1, wherein the FPGA device further comprises an acquisition module communicatively coupled to the detection module, the method further comprising, prior to responding to the received video data by the detection module:
and receiving video data sent by the acquisition equipment through the acquisition module, and transmitting the video data to the detection module.
8. The video data processing device is characterized in that the device is provided with an FPGA device and a network port, wherein the FPGA device at least comprises a detection module and an encoder which is in communication connection with the detection module; wherein,
the detection module is used for responding to received video data, and if the video data meets preset conditions, the video data is transmitted to the encoder;
The encoder is used for responding to the received video data, encoding the video data and obtaining encoded data corresponding to the video data;
the network port is configured to send the encoded data to a corresponding receiving end.
9. An electronic device comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other via the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the method according to any one of claims 1-7 when executing a program stored on a memory.
10. A computer-readable storage medium having instructions stored thereon, which when executed by one or more processors, cause the processors to perform the method of any of claims 1-7.
CN202311235420.3A 2023-09-22 2023-09-22 Video data processing method and device, electronic equipment and storage medium Pending CN117425023A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311235420.3A CN117425023A (en) 2023-09-22 2023-09-22 Video data processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117425023A 2024-01-19

Family

ID=89527439


Country Status (1)

Country Link
CN (1) CN117425023A (en)


Legal Events

Date Code Title Description
PB01 Publication