CN115499667B - Video processing method, device, equipment and readable storage medium - Google Patents


Info

Publication number
CN115499667B
Authority
CN
China
Prior art keywords
data block
memory area
fifo
data
mode
Prior art date
Legal status
Active
Application number
CN202211437518.2A
Other languages
Chinese (zh)
Other versions
CN115499667A (en)
Inventor
张贞雷
李拓
邹晓峰
满宏涛
周玉龙
魏红杨
Current Assignee
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Original Assignee
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority date
Filing date
Publication date
Application filed by Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority to CN202211437518.2A
Publication of CN115499667A
Application granted
Publication of CN115499667B
Legal status: Active
Anticipated expiration

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
    • H04N19/426Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements using memory downsizing methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking

Abstract

The application discloses a video processing method, apparatus, device, and readable storage medium, applied in the field of computer technology. After a video frame to be compressed is acquired, all columns in the video frame are divided into a plurality of groups according to a preset sampling mode, and the groups are alternately stored into a first memory area and a second memory area. Starting from the first address of the first memory area, the data stored in a target address segment of the first memory area is constructed into a first data block according to the preset sampling mode; simultaneously, starting from the first address of the second memory area, the data stored in an object address segment of the second memory area is constructed into a second data block according to the preset sampling mode. A compression operation is then performed on the first data block and the second data block simultaneously. The scheme improves video compression efficiency, saves buffer space, and avoids frame loss during compression. The disclosed apparatus, device, and readable storage medium achieve the same technical effects.

Description

Video processing method, device, equipment and readable storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a video processing method, apparatus, device, and readable storage medium.
Background
Currently, when video in a memory is compressed, video data needs to be converted into individual data blocks, and the process of converting the data blocks needs to use a buffer for temporarily storing components forming the data blocks.
However, the buffer space is limited, and in existing schemes the components forming a data block must be read from the buffer sequentially when converting a data block. Sequential reading slows the release of buffer space resources and reduces video compression efficiency. With limited buffer space that is released slowly, the buffer fills up easily; once it is full, the component data of subsequent data blocks cannot be written into the buffer and is discarded, which manifests as frame loss.
Therefore, how to improve video compression efficiency, save buffer space, and avoid frame loss during compression is a problem that needs to be solved by those skilled in the art.
Disclosure of Invention
In view of the foregoing, an object of the present application is to provide a video processing method, apparatus, device and readable storage medium, so as to improve video compression efficiency, save buffer space, and avoid frame loss during compression. The specific scheme is as follows:
In a first aspect, the present application provides a video processing method, including:
acquiring a video frame to be compressed;
dividing all columns in the video frame into a plurality of groups according to a preset sampling mode, and alternately storing each group into a first memory area and a second memory area;
starting from the first address of the first memory area, constructing data stored in a target address field in the first memory area into a first data block according to the preset sampling mode; simultaneously, starting from the first address of the second memory area, and constructing data stored in an object address field in the second memory area into a second data block according to the preset sampling mode;
and simultaneously performing compression operation on the first data block and the second data block.
Optionally, the preset sampling mode is: YUV422 mode or YUV420 mode;
accordingly, the dividing all columns in the video frame into a plurality of groups according to a preset sampling mode includes:
and dividing all columns in the video frame into a plurality of groups with the number of columns being 16 according to the YUV422 mode or the YUV420 mode.
Optionally, the preset sampling mode is: YUV444 mode;
accordingly, the dividing all columns in the video frame into a plurality of groups according to a preset sampling mode includes:
And dividing all columns in the video frame into a plurality of groups with the number of columns being 8 according to the YUV444 mode.
Optionally, the alternately storing each group into the first memory area and the second memory area includes:
arranging each group according to the column sequence numbers of all columns in the video frame to obtain a group sequence;
storing groups with odd arrangement positions in the group sequence into the first memory area, and storing groups with even arrangement positions in the group sequence into the second memory area; or storing the groups with even arrangement positions in the group sequence into the first memory area, and storing the groups with odd arrangement positions in the group sequence into the second memory area.
Optionally, the preset sampling mode is: YUV422 mode or YUV420 mode;
correspondingly, the constructing the data stored in the target address segment in the first memory area into the first data block according to the preset sampling mode includes:
according to the YUV422 mode or the YUV420 mode, respectively reading Y components, U components and V components of each pixel point stored in the target address field to corresponding cache queues, and constructing a first data block based on each cache queue;
Correspondingly, the constructing the data stored in the object address segment in the second memory area into a second data block according to the preset sampling mode includes:
and respectively reading Y components, U components and V components of each pixel point stored in the object address field to corresponding cache queues according to the YUV422 mode or the YUV420 mode, and constructing a second data block based on each cache queue.
Optionally, the preset sampling mode is: YUV444 mode;
correspondingly, the constructing the data stored in the target address segment in the first memory area into the first data block according to the preset sampling mode includes:
according to the YUV444 mode, respectively reading Y components, U components and V components of each pixel point stored in the target address section to corresponding cache queues, and constructing a first data block based on each cache queue;
correspondingly, the constructing the data stored in the object address segment in the second memory area into a second data block according to the preset sampling mode includes:
and according to the YUV444 mode, reading Y components, U components and V components of each pixel point stored in the object address field to corresponding cache queues, and constructing a second data block based on each cache queue.
Optionally, after the compressing operation is performed on the first data block and the second data block at the same time, the method further includes:
storing, in the corresponding cache queues, the Y components, U components and V components of subsequent new data blocks in the address spaces where the Y components, U components and V components of the first data block and the second data block were located.
Optionally, before constructing, starting from the first address of the first memory area, the data stored in the target address segment in the first memory area into a first data block according to the preset sampling mode, and simultaneously constructing, starting from the first address of the second memory area, the data stored in the object address segment in the second memory area into a second data block according to the preset sampling mode, the method further comprises:
and if the video frame is in the RGB format, converting the RGB format data stored in the first memory area and the second memory area into the YUV format.
Optionally, the compressing the first data block and the second data block simultaneously includes:
and performing DCT transformation in a compression operation on the first data block and the second data block simultaneously.
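The DCT transform applied to each data block can be illustrated with a minimal, hardware-agnostic sketch. The patent does not specify an implementation, so the naive matrix form below (Y = C·X·Cᵀ, the type-II DCT commonly used in block-based video compression) is purely illustrative:

```python
import math

def dct2(block):
    """Naive 2-D DCT-II of an N x N block, computed as Y = C * X * C^T."""
    n = len(block)
    # Orthonormal DCT-II basis matrix C.
    c = [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
          * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
          for i in range(n)] for k in range(n)]
    # tmp = C * X  (transform the columns)
    tmp = [[sum(c[k][i] * block[i][j] for i in range(n)) for j in range(n)]
           for k in range(n)]
    # Y = tmp * C^T  (transform the rows)
    return [[sum(tmp[k][j] * c[l][j] for j in range(n)) for l in range(n)]
            for k in range(n)]
```

For a flat 8×8 block of value v the DC coefficient is 8·v and all other coefficients are zero, which is a convenient sanity check for any DCT implementation.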
Optionally, the method further comprises:
when the first data block and the second data block start DCT transformation, constructing the data stored in the next address segment in the first memory area into a new first data block according to the preset sampling mode, and constructing the data stored in the next address segment in the second memory area into a new second data block according to the preset sampling mode, so as to simultaneously compress the new first data block and the new second data block.
Optionally, the method further comprises:
after the first data block and the second data block are compressed to obtain compressed data, adding a frame identifier to the compressed data, and writing the compressed data added with the frame identifier into a preset memory area.
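The patent does not specify the layout of the frame identifier added to the compressed data. As one hypothetical encoding, a small fixed header carrying the frame number and payload length could be prepended before the compressed bytes are written to the preset memory area:

```python
import struct

def tag_compressed_frame(frame_id, payload):
    """Prefix compressed frame data with a hypothetical 8-byte header:
    a big-endian frame identifier followed by the payload length, so
    individual frames can be located in the preset memory area."""
    header = struct.pack(">II", frame_id, len(payload))
    return header + payload
```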
In a second aspect, the present application provides a video processing apparatus, comprising:
the acquisition module is used for acquiring the video frames to be compressed;
the storage module is used for dividing all columns in the video frame into a plurality of groups according to a preset sampling mode, and alternately storing each group into a first memory area and a second memory area;
the data block construction module is used for constructing data stored in a target address section in the first memory area into a first data block according to the preset sampling mode from the first address of the first memory area; simultaneously, starting from the first address of the second memory area, and constructing data stored in an object address field in the second memory area into a second data block according to the preset sampling mode;
and the compression module is used for simultaneously compressing the first data block and the second data block.
Optionally, the compression module includes:
a first compression unit for simultaneously performing DCT transformation in a compression operation on the first data block and the second data block;
And the second compression unit is used for constructing the data stored in the next address section in the first memory area into a new first data block according to the preset sampling mode when the first data block and the second data block start DCT conversion, and constructing the data stored in the next address section in the second memory area into a new second data block according to the preset sampling mode at the same time so as to simultaneously compress the new first data block and the new second data block.
Optionally, the storage module is specifically configured to: when the preset sampling mode is the YUV422 mode or the YUV420 mode, divide all columns in the video frame into a plurality of groups each having 16 columns according to the YUV422 mode or the YUV420 mode.
Optionally, the storage module is specifically configured to: when the preset sampling mode is the YUV444 mode, divide all columns in the video frame into a plurality of groups each having 8 columns according to the YUV444 mode.
Optionally, the storage module is specifically configured to:
arranging each group according to the column sequence numbers of all columns in the video frame to obtain a group sequence; storing groups with odd arrangement positions in the group sequence into the first memory area, and storing groups with even arrangement positions in the group sequence into the second memory area; or storing the groups with even arrangement positions in the group sequence into the first memory area, and storing the groups with odd arrangement positions in the group sequence into the second memory area.
Optionally, the data block construction module is specifically configured to: when the preset sampling mode is the YUV422 mode or the YUV420 mode, read the Y components, U components and V components of each pixel stored in the target address segment into corresponding cache queues according to the YUV422 mode or the YUV420 mode, and construct the first data block based on the cache queues;
optionally, the data block construction module is specifically configured to: when the preset sampling mode is the YUV422 mode or the YUV420 mode, read the Y components, U components and V components of each pixel stored in the object address segment into corresponding cache queues according to the YUV422 mode or the YUV420 mode, and construct the second data block based on the cache queues.
Optionally, the data block construction module is specifically configured to: when the preset sampling mode is the YUV444 mode, read the Y components, U components and V components of each pixel stored in the target address segment into corresponding cache queues according to the YUV444 mode, and construct the first data block based on the cache queues;
optionally, the data block construction module is specifically configured to: when the preset sampling mode is the YUV444 mode, read the Y components, U components and V components of each pixel stored in the object address segment into corresponding cache queues according to the YUV444 mode, and construct the second data block based on the cache queues.
Optionally, the method further comprises:
and the buffer multiplexing module is used for storing Y components, U components and V components of the subsequent new data blocks in address spaces where the Y components, the U components and the V components of the first data block and the second data block are located in corresponding buffer queues after the first data block and the second data block are subjected to the compression operation at the same time.
Optionally, the method further comprises:
the format conversion module is used for constructing data stored in a target address segment in the first memory area into a first data block according to the preset sampling mode from the first address of the first memory area if the video frame is in an RGB format; and simultaneously, starting from the first address of the second memory area, and converting the RGB format data stored in the first memory area and the second memory area into YUV format before constructing the data stored in the object address field in the second memory area into a second data block according to the preset sampling mode.
Optionally, the method further comprises:
and the compressed data storage module is used for adding a frame identifier to the compressed data after compressing the first data block and the second data block to obtain compressed data, and writing the compressed data added with the frame identifier into a preset memory area.
In a third aspect, the present application provides an electronic device, including:
a memory for storing a computer program;
and a processor for executing the computer program to implement the video processing method disclosed above.
In a fourth aspect, the present application provides a readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the video processing method disclosed above.
As can be seen from the above solution, the present application provides a video processing method, including: acquiring a video frame to be compressed; dividing all columns in the video frame into a plurality of groups according to a preset sampling mode, and alternately storing each group into a first memory area and a second memory area; starting from the first address of the first memory area, constructing data stored in a target address field in the first memory area into a first data block according to the preset sampling mode; simultaneously, starting from the first address of the second memory area, and constructing data stored in an object address field in the second memory area into a second data block according to the preset sampling mode; and simultaneously performing compression operation on the first data block and the second data block.
As can be seen, the present application uses two memory areas when storing video frames in memory. Specifically, all columns in a video frame are divided into a plurality of groups according to a preset sampling mode, and the groups are alternately stored into a first memory area and a second memory area, so that the data of one video frame is split across two memory areas. Because the storing process follows the preset sampling mode, the data in the two memory areas provides the precondition for executing the subsequent compression operation in parallel. When converting data blocks, starting from the first address of the first memory area, the data stored in a target address segment of the first memory area is constructed into a first data block according to the preset sampling mode; that is, a segment of data is read from the first address of the first memory area and used to construct the first data block. While the first data block is being constructed, starting from the first address of the second memory area, the data stored in an object address segment of the second memory area is constructed into a second data block according to the preset sampling mode; that is, a segment of data is read from the first address of the second memory area and used to construct the second data block. In this way, the construction of the two data blocks is completed at the same time, so the buffer resources occupied by the two data blocks can be released as early as possible, saving buffer space and avoiding frame loss. And because the two data blocks are constructed at the same time, they can also be compressed at the same time, which improves compression efficiency. The video compression efficiency can therefore be improved, buffer space saved, and frame loss during compression avoided.
Accordingly, the video processing device, the video processing equipment and the readable storage medium have the technical effects.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings may be obtained according to the provided drawings without inventive effort to a person skilled in the art.
FIG. 1 is a flow chart of a video processing method disclosed in the present application;
FIG. 2 is a schematic diagram of a data structure of a first data block and a second data block disclosed in the present application;
FIG. 3 is a schematic diagram of a compression framework disclosed herein;
FIG. 4 is a schematic diagram of a video processing apparatus disclosed herein;
fig. 5 is a schematic diagram of an electronic device disclosed in the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
At present, when converting a data block, existing schemes need to sequentially read the components forming the data block from a cache, so buffer space resources are released relatively slowly, and the sequential reading of component data reduces video compression efficiency. With limited buffer space that is released slowly, the buffer fills up easily; once it is full, the component data of subsequent data blocks cannot be written into the buffer and is discarded, which manifests as frame loss. To address this, the present application provides a video processing scheme that can improve video compression efficiency, save buffer space, and avoid frame loss during compression.
Referring to fig. 1, an embodiment of the present application discloses a video processing method, including:
s101, obtaining a video frame to be compressed.
S102, dividing all columns in the video frame into a plurality of groups according to a preset sampling mode, and alternately storing each group into a first memory area and a second memory area.
In this embodiment, two memory areas are used when storing video frames in memory. Specifically, all columns in a video frame are divided into a plurality of groups according to a preset sampling mode, and each group is alternately stored in a first memory area and a second memory area, so that all data in one video frame are separated to exist in two memory areas. The storage process is completed according to a preset sampling mode, so that the data stored in the two memory areas can provide a precondition for parallel execution for the subsequent compression operation.
The preset sampling mode may be the YUV422 mode, the YUV420 mode or the YUV444 mode; different modes specify different sampling sizes. The YUV422 mode and the YUV420 mode specify a sampling size of 16, while the YUV444 mode specifies a sampling size of 8. Therefore, when the YUV422 mode or the YUV420 mode is adopted, every 16 columns form a group; when the YUV444 mode is adopted, every 8 columns form a group. In one embodiment, the preset sampling mode is the YUV422 mode or the YUV420 mode; accordingly, dividing all columns in the video frame into a plurality of groups according to the preset sampling mode includes: dividing all columns in the video frame into a plurality of groups each having 16 columns according to the YUV422 mode or the YUV420 mode. In one embodiment, the preset sampling mode is the YUV444 mode; accordingly, dividing all columns in the video frame into a plurality of groups according to the preset sampling mode includes: dividing all columns in the video frame into a plurality of groups each having 8 columns according to the YUV444 mode.
Assuming a video frame has 96 columns, numbered 0-95: grouped according to the YUV422 mode or the YUV420 mode, columns 0-15 form a group, columns 16-31 form a group, columns 32-47 form a group, and so on, yielding 6 groups. The 6 groups are alternately stored into the first memory area and the second memory area, namely: group 1 (columns 0-15) is stored into the first memory area, group 2 (columns 16-31) into the second memory area, group 3 (columns 32-47) into the first memory area, group 4 (columns 48-63) into the second memory area, group 5 (columns 64-79) into the first memory area, and group 6 (columns 80-95) into the second memory area. Groups 1, 3 and 5 are thus stored in the first memory area while groups 2, 4 and 6 are stored in the second memory area, realizing the alternate storage of the groups into the two memory areas. Accordingly, in one embodiment, alternately storing each group into the first memory area and the second memory area includes: arranging the groups according to the column sequence numbers of all columns in the video frame to obtain a group sequence; storing the groups at odd positions in the group sequence into the first memory area and the groups at even positions into the second memory area; or storing the groups at even positions in the group sequence into the first memory area and the groups at odd positions into the second memory area.
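The column grouping and ping-pong storage described above can be sketched as a simple software model of the hardware behaviour (an illustration, not the patented implementation; `frame` is a row-major list of rows, and `group_width` is 16 for YUV422/YUV420 or 8 for YUV444):

```python
def interleave_columns(frame, group_width):
    """Split a frame's columns into fixed-width groups and ping-pong
    them into two memory areas: group 1 -> area A, group 2 -> area B,
    group 3 -> area A, and so on."""
    n_cols = len(frame[0])
    area_a, area_b = [], []
    for g, start in enumerate(range(0, n_cols, group_width)):
        group = [row[start:start + group_width] for row in frame]
        (area_a if g % 2 == 0 else area_b).append(group)
    return area_a, area_b
```

For the 96-column example above, `interleave_columns` with `group_width=16` places groups 1, 3 and 5 in the first area and groups 2, 4 and 6 in the second.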
If the 96 columns of the video frame are instead grouped according to the YUV444 mode, columns 0-7 form a group, columns 8-15 form a group, and so on, yielding 12 groups. The 12 groups are alternately stored into the first memory area and the second memory area in the same way as the above example, which is not repeated here.
S103, starting from the first address of the first memory area, constructing data stored in a target address section in the first memory area into a first data block according to a preset sampling mode; and simultaneously, starting from the first address of the second memory area, and constructing the data stored in the object address field in the second memory area into a second data block according to a preset sampling mode.
Different modes specify different data block sizes. The YUV422 mode and the YUV420 mode specify a data block size of 16×16, while the YUV444 mode specifies a data block size of 8×8. Therefore, when the YUV422 mode or the YUV420 mode is adopted, the first data block and the second data block are 16×16; when the YUV444 mode is adopted, the first data block and the second data block are 8×8.
In this embodiment, when constructing a data block based on the data in the first memory area, a segment of data is read starting from the first address of the first memory area. From the data read, the Y component, U component and V component of each pixel used to construct the first data block are determined and written into cache queues, and the first data block is then constructed from the data in the cache queues. While the first data block is being constructed, a segment of data is simultaneously read starting from the first address of the second memory area. From that data, the Y component, U component and V component of each pixel used to construct the second data block are determined and written into cache queues, and the second data block is then constructed from the data in the cache queues. In this way, the construction of the two data blocks is completed at the same time, so the buffer resources occupied by the two data blocks can be released as early as possible, saving buffer space and avoiding frame loss.
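A minimal software model of the per-component cache queues (the FIFOs referred to later) might look like this; `segment` is a hypothetical stand-in for the (Y, U, V) pixels read from one address segment:

```python
from collections import deque

def fill_component_fifos(segment):
    """Demultiplex a segment of (Y, U, V) pixels into three FIFO
    queues, mirroring the per-component cache queues used while a
    data block is being built."""
    y_fifo, u_fifo, v_fifo = deque(), deque(), deque()
    for y, u, v in segment:
        y_fifo.append(y)
        u_fifo.append(u)
        v_fifo.append(v)
    return y_fifo, u_fifo, v_fifo
```

Once a block has been assembled and handed to the compression stage, the entries it consumed can be popped, which is what allows the queue space to be reused by the next block.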
The first memory area and the second memory area may store video pixels in YUV format or in RGB format. If they store YUV-format pixels, the Y, U and V components of each pixel used to construct the first and second data blocks can be read directly from the two memory areas. If they store RGB-format pixels, the data read from the memory areas must first be converted to YUV format before the Y, U and V components of the pixels used to construct the first and second data blocks can be determined. However, that approach requires a format conversion every time a segment of data is read from memory. Therefore, when the first and second memory areas store RGB-format pixels, the stored RGB data is first converted to YUV format as a whole, and the data blocks are created afterwards. In one embodiment, before constructing, starting from the first address of the first memory area, the data stored in the target address segment in the first memory area into a first data block according to the preset sampling mode, and simultaneously constructing, starting from the first address of the second memory area, the data stored in the object address segment in the second memory area into a second data block according to the preset sampling mode, the method further comprises: if the video frame is in RGB format, converting the RGB-format data stored in the first memory area and the second memory area into YUV format.
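The RGB-to-YUV conversion step can be sketched per pixel as below. The patent does not fix a particular conversion matrix, so the commonly used full-range BT.601 coefficients are an assumption here:

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using full-range BT.601
    coefficients (assumed; the patent does not specify a matrix).
    U and V are offset by 128 so that zero chroma maps to mid-scale."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128
    return round(y), round(u), round(v)
```

Converting both memory areas in bulk before block construction, as the embodiment describes, means this per-pixel arithmetic runs once per frame rather than once per segment read.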
In one embodiment, the preset sampling mode is YUV422 mode or YUV420 mode. Correspondingly, constructing the data stored in the target address segment of the first memory area into the first data block according to the preset sampling mode comprises: reading the Y component, U component and V component of each pixel point stored in the target address segment into the corresponding cache queues according to the YUV422 or YUV420 mode, and constructing the first data block based on those cache queues. Likewise, constructing the data stored in the object address segment of the second memory area into the second data block according to the preset sampling mode comprises: reading the Y component, U component and V component of each pixel point stored in the object address segment into the corresponding cache queues according to the YUV422 or YUV420 mode, and constructing the second data block based on those cache queues. It follows that if the first memory area stores video pixels in YUV format, the data stored in the target address segment is the Y, U and V components of each pixel point used to construct the first data block; similarly, if the second memory area stores video pixels in YUV format, the data stored in the object address segment is the Y, U and V components of each pixel point used to construct the second data block. In YUV420 mode, the Y component input to the subsequent compression module is 16×16 and the U/V components are 8×8; the discarding of U/V components is performed when the YUV data is written into the respective FIFOs. For example, in YUV420 mode only the U/V components of even rows and even columns are retained, so that the U/V components are 8×8 when input to the subsequent compression module. Accordingly, in YUV422 mode the Y, U and V components are likewise discarded or retained according to the rules established for that mode.
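The even-row, even-column discarding rule for YUV420 can be sketched as follows. This is a pure-Python illustration of the rule, not the hardware FIFO write logic; the function name is hypothetical.

```python
def subsample_420(chroma):
    """YUV420 chroma subsampling: keep only even-row, even-column
    samples, so a 16x16 U or V plane shrinks to the 8x8 block size
    fed to the subsequent compression module."""
    return [row[::2] for row in chroma[::2]]

# Toy 16x16 chroma plane whose sample at (r, c) is r*16 + c.
plane = [[r * 16 + c for c in range(16)] for r in range(16)]
sub = subsample_420(plane)
print(len(sub), len(sub[0]))  # 8 8
print(sub[1][1])              # sample from row 2, column 2 -> 34
```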
In one embodiment, the preset sampling mode is YUV444 mode. Correspondingly, constructing the data stored in the target address segment of the first memory area into the first data block according to the preset sampling mode comprises: reading the Y component, U component and V component of each pixel point stored in the target address segment into the corresponding cache queues according to the YUV444 mode, and constructing the first data block based on those cache queues. Likewise, constructing the data stored in the object address segment of the second memory area into the second data block according to the preset sampling mode comprises: reading the Y component, U component and V component of each pixel point stored in the object address segment into the corresponding cache queues according to the YUV444 mode, and constructing the second data block based on those cache queues. It follows that if the first memory area stores video pixels in YUV format, the data stored in the target address segment is the Y, U and V components of each pixel point used to construct the first data block; similarly, if the second memory area stores video pixels in YUV format, the data stored in the object address segment is the Y, U and V components of each pixel point used to construct the second data block. By contrast, in YUV422 mode only the U/V components of even columns are retained, so the U/V components input to the subsequent compression module are 16×8. It can be seen that when forming data blocks in the different modes, the Y, U and V components all need to be discarded or retained according to the rules established for the respective mode.
Of course, when the first and second memory areas store video pixels in RGB format, the RGB pixels in the two memory areas may be converted into YUV format uniformly before the data blocks are constructed; alternatively, after each piece of pixel data used to construct a data block is read, the currently read pixels may be converted into YUV format, so that the Y component, U component and V component of each pixel point of the data block are obtained.
In one example, the data composition of the first data block and the second data block may refer to fig. 2. As shown in fig. 2, in YUV422 or YUV420 mode, BLOCK0 (the first first data block) corresponds to rows 0-15 × columns 0-15 of the video frame, BLOCK1 (the first second data block) corresponds to rows 0-15 × columns 16-31, BLOCK2 (the second first data block) corresponds to rows 0-15 × columns 32-47, and so on. In YUV444 mode, BLOCK0 (the first first data block) corresponds to rows 0-7 × columns 0-7 of the video frame, BLOCK1 (the first second data block) corresponds to rows 0-7 × columns 8-15, and so on. The component data constituting the first data blocks BLOCK0, BLOCK2, BLOCK4, … is stored in the first memory area; the component data constituting the second data blocks BLOCK1, BLOCK3, BLOCK5, … is stored in the second memory area.
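Under the layouts just described, the memory area and block index of any frame column can be computed directly. The sketch below illustrates that mapping; the function name and the 'LOW'/'HIGH' labels are hypothetical shorthand for the first and second memory areas.

```python
def column_mapping(col, mode):
    """Map a frame column to (memory_area, block_index_in_row_band).

    Groups are 16 columns wide in YUV422/YUV420 and 8 columns wide in
    YUV444; groups alternate between the first ('LOW') and second
    ('HIGH') memory areas, and the group number is the BLOCK number.
    """
    group_width = 8 if mode == "YUV444" else 16
    group = col // group_width
    area = "LOW" if group % 2 == 0 else "HIGH"
    return area, group

print(column_mapping(0, "YUV420"))   # ('LOW', 0)  -> BLOCK0
print(column_mapping(33, "YUV420"))  # ('LOW', 2)  -> BLOCK2
print(column_mapping(9, "YUV444"))   # ('HIGH', 1) -> BLOCK1
```

Even-numbered blocks land in the first memory area and odd-numbered blocks in the second, which is exactly what allows one block from each area to be constructed in parallel.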
This embodiment completes the construction of two data blocks at the same time, so that the cache resources occupied by the two data blocks can be released as early as possible. Accordingly, in one embodiment, after the compression operation is performed on the first data block and the second data block simultaneously, the method further includes: in the corresponding cache queues, allowing the address space occupied by the Y, U and V components of the first and second data blocks to store the Y, U and V components of subsequent new data blocks, so that the cache queues can continue to be used to construct new data blocks. This multiplexes the cache resources occupied by data blocks that have already entered the compression module, saving cache space and avoiding frame loss.
S104, simultaneously performing compression operation on the first data block and the second data block.
Because this embodiment completes the construction of two data blocks at the same time, the two data blocks can subsequently be compressed simultaneously, which improves compression efficiency. In one embodiment, performing the compression operation on the first data block and the second data block simultaneously includes: performing the DCT transform of the compression operation on the first data block and the second data block simultaneously. As can be seen, the compression operation includes a DCT transform.
Since the DCT transform takes considerable time, waiting for the DCT transforms of the first and second data blocks to complete before compressing subsequent data blocks would lengthen the compression time. For this reason, in this embodiment, when the first and second data blocks start their DCT transforms, the data stored in the next address segment of the first memory area is constructed into a new first data block according to the preset sampling mode, and simultaneously the data stored in the next address segment of the second memory area is constructed into a new second data block according to the preset sampling mode, so that the compression operation can be performed on the new first and second data blocks at the same time. That is, once the first and second data blocks have been read by the DCT transform module, reading and construction of the subsequent data blocks begins; a second DCT transform module can be provided at this point, so that the subsequent data blocks are DCT-transformed by that module. In this way there is no need to wait for the DCT transforms of the first and second data blocks to finish: the DCT transforms of subsequent data blocks can proceed while the first and second data blocks are still being transformed, shortening the compression time and improving compression efficiency.
Referring to fig. 2, in one example, two DCT transform modules may be provided for all first data blocks (DCT0 and DCT2), and at the same time two DCT transform modules for all second data blocks (DCT1 and DCT3). Then, when DCT0 has finished reading BLOCK0 and is about to enter the DCT transform flow, DCT2 starts reading BLOCK2 in order to transform it, so that the DCT transforms of BLOCK0 and BLOCK2 overlap in time, saving time. While DCT0 reads BLOCK0, DCT1 likewise reads BLOCK1; and when DCT1 has read BLOCK1 and enters the DCT transform flow, DCT3 starts reading BLOCK3 to transform it, so the DCT transforms of BLOCK1 and BLOCK3 also overlap in time. Since BLOCK0 and BLOCK1 start their DCT transforms at the same time, the transform times of BLOCK0, BLOCK1, BLOCK2 and BLOCK3 can all overlap on the basis of the overlap between BLOCK0 and BLOCK2. It can be seen that this embodiment performs the DCT transform on multiple data blocks in parallel, improving compression efficiency; providing even more DCT transform modules can improve it further.
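The time saving from doubling the DCT units can be illustrated with a toy schedule. The read and transform durations below are illustrative assumptions, not figures from the patent; reads are serialized to model the shared FIFO/memory read port, while transforms run on whichever DCT unit owns the block.

```python
def finish_time(n_blocks, n_units, read=1, dct=4):
    """Greedy schedule: block reads are serialized, but each DCT unit
    transforms its block while the other units keep reading."""
    unit_free = [0] * n_units
    read_free = 0
    for b in range(n_blocks):
        u = b % n_units                       # round-robin over DCT units
        start = max(read_free, unit_free[u])  # wait for read port and unit
        read_free = start + read
        unit_free[u] = start + read + dct
    return max(unit_free)

# One DCT unit vs. two (as with DCT0/DCT2) for four blocks:
print(finish_time(4, 1), finish_time(4, 2))  # 20 11
```

With one unit every block waits for the previous transform; with two units the transforms of adjacent blocks overlap, which is the effect the embodiment exploits.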
In one embodiment, after the first data block and the second data block are compressed to obtain compressed data, a frame identifier is added to the compressed data, and the compressed data added with the frame identifier is written into a preset memory area.
It can be seen that this embodiment uses two memory areas when storing a video frame in memory. Specifically, all columns of the video frame are divided into a plurality of groups according to the preset sampling mode, and the groups are stored alternately in the first and second memory areas, so that the data of one video frame is split across the two memory areas. Because the storing process follows the preset sampling mode, the data in the two memory areas provides the precondition for parallel execution of the subsequent compression operation. In this embodiment, when constructing a data block, a piece of data is read starting from the first address of the first memory area to construct the first data block; while the first data block is being constructed, a piece of data is simultaneously read starting from the first address of the second memory area to construct the second data block. The embodiment can therefore complete the construction of two data blocks at the same time, so that the cache resources they occupy can be released as soon as possible, saving cache space and avoiding frame loss. And because two data blocks are constructed at the same time, they can subsequently be compressed simultaneously, improving compression efficiency. Overall, video compression efficiency is improved, cache space is saved, and frame loss during compression is avoided.
When the baseboard management control system in a server compresses video, the host side transfers video data over PCIe to the VGA (Video Graphics Array) in the baseboard management control system, and the VGA writes the video data into the DDR. A read control module (RD_CTRL) then reads the data from the DDR, a color space conversion module (RGB2YUV) converts the original RGB-format video data into YUV format, and the Y, U and V components are buffered in FIFOs built from on-chip storage resources according to the BLOCK format-conversion requirements, completing the YUV2BLOCK format conversion. The compression module reads each BLOCK in BLOCK order and compresses it; once compression is complete, the compressed data is written into the DDR and sent to the remote end through the MAC, where the video data can be displayed.
The FIFO buffers must be given an appropriate depth and width for the video resolution. For example, at a resolution of 1920 × 1200, a FIFO depth of 16384 with a FIFO width of 8 bits ensures that the FIFO does not overflow. Here, width means the amount of data written into the FIFO buffer each time, and depth means the total number of such 8-bit data words the FIFO buffer can hold.
In this embodiment, when the VGA writes the raw RGB data into the DDR, it writes the data into two memory areas, SPACE_LOW and SPACE_HIGH, and the compression module of this embodiment supports the YUV444, YUV422 and YUV420 modes. SPACE_LOW is the first memory area described in the above embodiments and SPACE_HIGH is the second memory area; of course, the reverse assignment is also possible, with SPACE_LOW regarded as the second memory area and SPACE_HIGH as the first.
If the compression module adopts YUV422 or YUV420 mode, the RGB data of columns 0-15, 32-47, 64-79, … of all rows of a video frame is written into SPACE_LOW, while the RGB data of columns 16-31, 48-63, 80-95, … of all rows is written into SPACE_HIGH.
If the compression module adopts YUV444 mode, the RGB data of columns 0-7, 16-23, 32-39, … of all rows of a video frame is written into SPACE_LOW, while the RGB data of columns 8-15, 24-31, 40-47, … of all rows is written into SPACE_HIGH.
Taking YUV420 mode as an example: when RGB data is read from the DDR, the data in SPACE_LOW is read first, 16 RGB pixels at a time, which after the RGB2YUV color space conversion is input to YUV2BLOCK_NEW_0. The second read takes data from SPACE_HIGH, again 16 RGB pixels at a time, which after the RGB2YUV color space conversion is input to YUV2BLOCK_NEW_1. These steps repeat so that the raw RGB data is fed into YUV2BLOCK_NEW_0 and YUV2BLOCK_NEW_1 respectively. YUV2BLOCK_NEW_0 and YUV2BLOCK_NEW_1 are two YUV-to-BLOCK conversion modules, corresponding to the memory areas SPACE_LOW and SPACE_HIGH respectively. In this way the two YUV-to-BLOCK modules perform BLOCK conversion synchronously, providing the precondition for synchronous compression of the subsequent BLOCKs.
In YUV422 mode, 16 RGB pixels are likewise read at a time; in YUV444 mode, 8 RGB pixels are read at a time.
In the prior art, 16 Y_FIFOs, 16 U_FIFOs and 16 V_FIFOs must be provided to complete BLOCK conversion. This embodiment provides 32 Y-component cache queues (Y_FIFO_0_A to Y_FIFO_15_A and Y_FIFO_0_B to Y_FIFO_15_B), 32 U-component cache queues (U_FIFO_0_A to U_FIFO_15_A and U_FIFO_0_B to U_FIFO_15_B) and 32 V-component cache queues (V_FIFO_0_A to V_FIFO_15_A and V_FIFO_0_B to V_FIFO_15_B). The number of cache queues actually used differs between sampling modes.
In one example, Y_FIFO_0_A to Y_FIFO_15_A buffer the Y components that constitute BLOCK0, and Y_FIFO_0_B to Y_FIFO_15_B buffer the Y components that constitute BLOCK2. Likewise, U_FIFO_0_A to U_FIFO_15_A buffer the U components of BLOCK0, U_FIFO_0_B to U_FIFO_15_B the U components of BLOCK2, V_FIFO_0_A to V_FIFO_15_A the V components of BLOCK0, and V_FIFO_0_B to V_FIFO_15_B the V components of BLOCK2. The embodiment can thus construct BLOCK0 and BLOCK2 synchronously, providing the precondition for their synchronous compression; in the same way it can simultaneously construct BLOCK4 and BLOCK6, BLOCK8 and BLOCK10, BLOCK1 and BLOCK3, BLOCK5 and BLOCK7, and so on.
In YUV420 mode, all Y components must be cached, while only the U/V components of even rows and even columns are cached. When caching the Y components of all rows and columns, first, for all Y components of rows 0/16/32/… of the video frame: write the Y components of columns 0-15 into Y_FIFO_0_A of YUV2BLOCK_NEW_0; the Y components of columns 16-31 into Y_FIFO_0_A of YUV2BLOCK_NEW_1; the Y components of columns 32-47 into Y_FIFO_0_B of YUV2BLOCK_NEW_0; the Y components of columns 48-63 into Y_FIFO_0_B of YUV2BLOCK_NEW_1; the Y components of columns 64-79 into Y_FIFO_0_A of YUV2BLOCK_NEW_0; the Y components of columns 80-95 into Y_FIFO_0_A of YUV2BLOCK_NEW_1; the Y components of columns 96-111 into Y_FIFO_0_B of YUV2BLOCK_NEW_0; the Y components of columns 112-127 into Y_FIFO_0_B of YUV2BLOCK_NEW_1; and so on for the remaining columns, until all Y components of rows 0/16/32/… are cached. All Y components of rows 1/17/33/… of the video frame are cached according to the same rule, with the cache queues changing to Y_FIFO_1_A and Y_FIFO_1_B. Accordingly, all Y components of rows 15/31/47/… are also cached according to the same rule, with the cache queues changing to Y_FIFO_15_A and Y_FIFO_15_B.
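The Y-routing rule just described, with 16-column groups cycling through the A and B banks of the two conversion modules and the FIFO index given by the row number modulo 16, condenses into a small sketch (function name hypothetical):

```python
def route_y_sample(row, col):
    """Destination (module, bank, fifo_index) of one Y sample in
    YUV420/YUV422 mode: 16-column groups cycle NEW_0/A, NEW_1/A,
    NEW_0/B, NEW_1/B; the FIFO index is the row number mod 16."""
    phase = (col // 16) % 4
    module = phase % 2          # 0 -> YUV2BLOCK_NEW_0, 1 -> YUV2BLOCK_NEW_1
    bank = "A" if phase < 2 else "B"
    return module, bank, row % 16

print(route_y_sample(0, 0))    # (0, 'A', 0)   -> Y_FIFO_0_A of NEW_0
print(route_y_sample(17, 40))  # (0, 'B', 1)   -> Y_FIFO_1_B of NEW_0
print(route_y_sample(15, 48))  # (1, 'B', 15)  -> Y_FIFO_15_B of NEW_1
```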
In YUV420 mode, when caching the U components of even rows and even columns, first, for the U components of rows 0/16/32/… of the video frame: write the U components of columns 0/2/4/…/14 into U_FIFO_0_A of YUV2BLOCK_NEW_0; the U components of columns 16/18/…/30 into U_FIFO_0_A of YUV2BLOCK_NEW_1; the U components of columns 32/34/…/46 into U_FIFO_0_B of YUV2BLOCK_NEW_0; the U components of columns 48/50/…/62 into U_FIFO_0_B of YUV2BLOCK_NEW_1; the U components of columns 64/66/…/78 into U_FIFO_0_A of YUV2BLOCK_NEW_0; the U components of columns 80/82/…/94 into U_FIFO_0_A of YUV2BLOCK_NEW_1; the U components of columns 96/98/…/110 into U_FIFO_0_B of YUV2BLOCK_NEW_0; the U components of columns 112/114/…/126 into U_FIFO_0_B of YUV2BLOCK_NEW_1; and so on until all U components of rows 0/16/32/… are cached. The U components of rows 2/18/34/… are cached according to the same rule, with the cache queues changing to U_FIFO_1_A and U_FIFO_1_B; the U components of rows 14/30/46/… are cached according to the same rule, with the cache queues changing to U_FIFO_7_A and U_FIFO_7_B.
In YUV420 mode, when caching the V components of even rows and even columns, first, for the V components of rows 0/16/32/… of the video frame: write the V components of columns 0/2/4/…/14 into V_FIFO_0_A of YUV2BLOCK_NEW_0; the V components of columns 16/18/…/30 into V_FIFO_0_A of YUV2BLOCK_NEW_1; the V components of columns 32/34/…/46 into V_FIFO_0_B of YUV2BLOCK_NEW_0; the V components of columns 48/50/…/62 into V_FIFO_0_B of YUV2BLOCK_NEW_1; the V components of columns 64/66/…/78 into V_FIFO_0_A of YUV2BLOCK_NEW_0; the V components of columns 80/82/…/94 into V_FIFO_0_A of YUV2BLOCK_NEW_1; the V components of columns 96/98/…/110 into V_FIFO_0_B of YUV2BLOCK_NEW_0; the V components of columns 112/114/…/126 into V_FIFO_0_B of YUV2BLOCK_NEW_1; and so on until all V components of rows 0/16/32/… are cached. The V components of rows 2/18/34/… are cached according to the same rule, with the cache queues changing to V_FIFO_1_A and V_FIFO_1_B; the V components of rows 14/30/46/… are cached according to the same rule, with the cache queues changing to V_FIFO_7_A and V_FIFO_7_B.
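The chroma routing in YUV420 mode differs from the Y routing in two ways: only even rows and even columns arrive, and the FIFO index advances every second row (rows 0/16/32/… use index 0, rows 2/18/34/… index 1, up to rows 14/30/46/… at index 7). A sketch, with hypothetical naming:

```python
def route_uv_420(row, col):
    """Destination (module, bank, fifo_index) of one U or V sample
    in YUV420 mode.  Only even rows/columns are cached; the column
    grouping matches the Y routing, but the FIFO index is
    (row mod 16) // 2, giving indices 0..7."""
    assert row % 2 == 0 and col % 2 == 0, "YUV420 keeps even rows/cols only"
    phase = (col // 16) % 4
    module = phase % 2
    bank = "A" if phase < 2 else "B"
    return module, bank, (row % 16) // 2

print(route_uv_420(0, 0))    # (0, 'A', 0) -> U_FIFO_0_A / V_FIFO_0_A of NEW_0
print(route_uv_420(2, 16))   # (1, 'A', 1)
print(route_uv_420(14, 32))  # (0, 'B', 7)
```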
In YUV422 mode, all Y components must be cached, while only the U/V components of even columns are cached. The Y components of all rows and columns are cached as in YUV420 mode: for the Y components of rows 0/16/32/…, write the Y components of columns 0-15 into Y_FIFO_0_A of YUV2BLOCK_NEW_0; columns 16-31 into Y_FIFO_0_A of YUV2BLOCK_NEW_1; columns 32-47 into Y_FIFO_0_B of YUV2BLOCK_NEW_0; columns 48-63 into Y_FIFO_0_B of YUV2BLOCK_NEW_1; columns 64-79 into Y_FIFO_0_A of YUV2BLOCK_NEW_0; columns 80-95 into Y_FIFO_0_A of YUV2BLOCK_NEW_1; columns 96-111 into Y_FIFO_0_B of YUV2BLOCK_NEW_0; columns 112-127 into Y_FIFO_0_B of YUV2BLOCK_NEW_1; and so on for the remaining columns, until all Y components of rows 0/16/32/… are cached. All Y components of rows 1/17/33/… are cached according to the same rule, with the cache queues changing to Y_FIFO_1_A and Y_FIFO_1_B; all Y components of rows 15/31/47/… are likewise cached, with the cache queues changing to Y_FIFO_15_A and Y_FIFO_15_B.
In YUV422 mode, the U components of even columns are cached. For the U components of rows 0/16/32/…: write the U components of columns 0/2/4/…/14 into U_FIFO_0_A of YUV2BLOCK_NEW_0; columns 16/18/…/30 into U_FIFO_0_A of YUV2BLOCK_NEW_1; columns 32/34/…/46 into U_FIFO_0_B of YUV2BLOCK_NEW_0; columns 48/50/…/62 into U_FIFO_0_B of YUV2BLOCK_NEW_1; columns 64/66/…/78 into U_FIFO_0_A of YUV2BLOCK_NEW_0; columns 80/82/…/94 into U_FIFO_0_A of YUV2BLOCK_NEW_1; columns 96/98/…/110 into U_FIFO_0_B of YUV2BLOCK_NEW_0; columns 112/114/…/126 into U_FIFO_0_B of YUV2BLOCK_NEW_1; and so on until all U components of rows 0/16/32/… are cached. All U components of rows 1/17/33/… are cached according to the same rule, with the cache queues changing to U_FIFO_1_A and U_FIFO_1_B; all U components of rows 15/31/47/… are likewise cached, with the cache queues changing to U_FIFO_15_A and U_FIFO_15_B.
In YUV422 mode, the V components of even columns are cached. For the V components of rows 0/16/32/…: write the V components of columns 0/2/4/…/14 into V_FIFO_0_A of YUV2BLOCK_NEW_0; columns 16/18/…/30 into V_FIFO_0_A of YUV2BLOCK_NEW_1; columns 32/34/…/46 into V_FIFO_0_B of YUV2BLOCK_NEW_0; columns 48/50/…/62 into V_FIFO_0_B of YUV2BLOCK_NEW_1; columns 64/66/…/78 into V_FIFO_0_A of YUV2BLOCK_NEW_0; columns 80/82/…/94 into V_FIFO_0_A of YUV2BLOCK_NEW_1; columns 96/98/…/110 into V_FIFO_0_B of YUV2BLOCK_NEW_0; columns 112/114/…/126 into V_FIFO_0_B of YUV2BLOCK_NEW_1; and so on until all V components of rows 0/16/32/… are cached. All V components of rows 1/17/33/… are cached according to the same rule, with the cache queues changing to V_FIFO_1_A and V_FIFO_1_B; all V components of rows 15/31/47/… are likewise cached, with the cache queues changing to V_FIFO_15_A and V_FIFO_15_B.
In YUV444 mode, the Y/U/V data of all rows and columns is cached. When caching the Y components of all rows and columns, first, for the Y components of rows 0/8/16/…: write the Y components of columns 0-7 into Y_FIFO_0_A of YUV2BLOCK_NEW_0; columns 8-15 into Y_FIFO_0_A of YUV2BLOCK_NEW_1; columns 16-23 into Y_FIFO_0_B of YUV2BLOCK_NEW_0; columns 24-31 into Y_FIFO_0_B of YUV2BLOCK_NEW_1; columns 32-39 into Y_FIFO_0_A of YUV2BLOCK_NEW_0; columns 40-47 into Y_FIFO_0_A of YUV2BLOCK_NEW_1; columns 48-55 into Y_FIFO_0_B of YUV2BLOCK_NEW_0; columns 56-63 into Y_FIFO_0_B of YUV2BLOCK_NEW_1; and so on for the remaining columns, until all Y components of rows 0/8/16/… are cached. All Y components of rows 1/9/17/… are cached according to the same rule, with the cache queues changing to Y_FIFO_1_A and Y_FIFO_1_B; all Y components of rows 7/15/23/… are likewise cached, with the cache queues changing to Y_FIFO_7_A and Y_FIFO_7_B.
In YUV444 mode, when caching the U components of all rows and columns, first, for the U components of rows 0/8/16/…: write the U components of columns 0-7 into U_FIFO_0_A of YUV2BLOCK_NEW_0; columns 8-15 into U_FIFO_0_A of YUV2BLOCK_NEW_1; columns 16-23 into U_FIFO_0_B of YUV2BLOCK_NEW_0; columns 24-31 into U_FIFO_0_B of YUV2BLOCK_NEW_1; columns 32-39 into U_FIFO_0_A of YUV2BLOCK_NEW_0; columns 40-47 into U_FIFO_0_A of YUV2BLOCK_NEW_1; columns 48-55 into U_FIFO_0_B of YUV2BLOCK_NEW_0; columns 56-63 into U_FIFO_0_B of YUV2BLOCK_NEW_1; and so on until all U components of rows 0/8/16/… are cached. All U components of rows 1/9/17/… are cached according to the same rule, with the cache queues changing to U_FIFO_1_A and U_FIFO_1_B; all U components of rows 7/15/23/… are likewise cached, with the cache queues changing to U_FIFO_7_A and U_FIFO_7_B.
In YUV444 mode, when caching the V components of all rows and columns, first, for the V components of rows 0/8/16/…: write the V components of columns 0-7 into V_FIFO_0_A of YUV2BLOCK_NEW_0; columns 8-15 into V_FIFO_0_A of YUV2BLOCK_NEW_1; columns 16-23 into V_FIFO_0_B of YUV2BLOCK_NEW_0; columns 24-31 into V_FIFO_0_B of YUV2BLOCK_NEW_1; columns 32-39 into V_FIFO_0_A of YUV2BLOCK_NEW_0; columns 40-47 into V_FIFO_0_A of YUV2BLOCK_NEW_1; columns 48-55 into V_FIFO_0_B of YUV2BLOCK_NEW_0; columns 56-63 into V_FIFO_0_B of YUV2BLOCK_NEW_1; and so on until all V components of rows 0/8/16/… are cached. All V components of rows 1/9/17/… are cached according to the same rule, with the cache queues changing to V_FIFO_1_A and V_FIFO_1_B; all V components of rows 7/15/23/… are likewise cached, with the cache queues changing to V_FIFO_7_A and V_FIFO_7_B.
It can be seen that, across the different sampling modes, using 32 Y-component, 32 U-component and 32 V-component cache queues allows one conversion module (YUV2BLOCK_NEW_0 or YUV2BLOCK_NEW_1) to construct two BLOCKs synchronously. The embodiment can therefore compress data synchronously: YUV2BLOCK_NEW_0 and YUV2BLOCK_NEW_1 simultaneously construct BLOCK_0 and BLOCK_1, so BLOCK_0 and BLOCK_1 are compressed synchronously. Meanwhile, YUV2BLOCK_NEW_0 constructs BLOCK_0 and BLOCK_2 at the same time, and their compression overlaps for part of the time, so BLOCK_0 and BLOCK_2 are compressed in partial (not full) synchrony; similarly, YUV2BLOCK_NEW_1 constructs BLOCK_1 and BLOCK_3 at the same time, and their compression also partially overlaps. Compression efficiency is improved as a result.
In YUV420 mode, the buffers are read as follows. Y_FIFO_0_A in YUV2BLOCK_NEW_0 is read 16 times while Y_FIFO_0_A in YUV2BLOCK_NEW_1 is read 16 times synchronously; …; Y_FIFO_15_A in YUV2BLOCK_NEW_0 is read 16 times while Y_FIFO_15_A in YUV2BLOCK_NEW_1 is read 16 times synchronously, thereby reading a 16×16 set of Y components to form 16×16 BLOCK data. U_FIFO_0_A in YUV2BLOCK_NEW_0 is read 8 times while U_FIFO_0_A in YUV2BLOCK_NEW_1 is read 8 times synchronously; …; U_FIFO_7_A in YUV2BLOCK_NEW_0 is read 8 times while U_FIFO_7_A in YUV2BLOCK_NEW_1 is read 8 times synchronously, thereby reading an 8×8 set of U components to form 8×8 BLOCK data. V_FIFO_0_A in YUV2BLOCK_NEW_0 is read 8 times while V_FIFO_0_A in YUV2BLOCK_NEW_1 is read 8 times synchronously; …; V_FIFO_7_A in YUV2BLOCK_NEW_0 is read 8 times while V_FIFO_7_A in YUV2BLOCK_NEW_1 is read 8 times synchronously, thereby reading an 8×8 set of V components to form 8×8 BLOCK data. BLOCK data can thus be constructed from the read data and subsequently fed into the compression module.
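The Y read sequence above, draining each row FIFO in order, amounts to the following assembly step, sketched with Python deques standing in for the hardware FIFOs (function name hypothetical):

```python
from collections import deque

def assemble_y_block(fifos):
    """Build one 16x16 Y BLOCK by popping 16 samples from each of
    the 16 row FIFOs of one conversion module, in FIFO order
    (Y_FIFO_0 read 16 times, then Y_FIFO_1, ..., Y_FIFO_15)."""
    return [[fifos[r].popleft() for _ in range(16)] for r in range(16)]

# Toy fill: FIFO r holds the 16 Y samples of row r, valued r*100 + c.
fifos = [deque(r * 100 + c for c in range(16)) for r in range(16)]
block = assemble_y_block(fifos)
print(block[2][3])                # 203: row 2, column 3
print(all(not f for f in fifos))  # True: every FIFO fully drained
```

Draining the FIFOs completely as the block is assembled is what frees the cache queues for the next block, matching the resource-multiplexing goal stated earlier.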
Referring to fig. 3, the compression framework provided by this embodiment includes two BLOCK conversion modules, YUV2BLOCK_NEW_0 (conversion module 0) and YUV2BLOCK_NEW_1 (conversion module 1), and two compression modules, compression module 0 and compression module 1. Compression module 0 contains two DCT units, DCT0 and DCT2; compression module 1 contains two DCT units, DCT1 and DCT3.
Specifically, YUV2BLOCK_NEW_0 generates the first BLOCK0, which is sent to the DCT0 unit in compression module 0 for DCT transform processing. Because the DCT transform takes a long time, while DCT0 in compression module 0 is processing BLOCK0, the arbitration module again generates read timing for YUV2BLOCK_NEW_0; at this point Y_FIFO_0_B, Y_FIFO_1_B, … are read to construct BLOCK_2, and BLOCK_2 is input to DCT2 in compression module 0. BLOCK0 and BLOCK2 thus overlap in compression time and can be regarded as compressed in parallel.
Since YUV2BLOCK_NEW_1 generates BLOCK1 at the same time that YUV2BLOCK_NEW_0 generates BLOCK0, BLOCK0 and BLOCK1 can be compressed simultaneously. Likewise, BLOCK1 and BLOCK3 have overlapping compression time, so BLOCK0, BLOCK1, BLOCK2 and BLOCK3 may be compressed simultaneously. In this way, every unit in the YUV2BLOCK modules and the compression modules is fully utilized, which greatly increases the data processing speed. The other units in the compression module include a quantization unit, an entropy coding unit, a framing unit, and so on. The quantization unit uses a smaller quantization interval for low-frequency components than for high-frequency components, obtaining a higher compression ratio while ensuring image quality. The entropy coding unit further compresses the video image, assigning code lengths according to the probability distribution of the symbols. The framing unit determines the image start flag, image end flag, frame header, and so on. Because this embodiment has two compression modules outputting compressed data synchronously, a frame header and frame end must be added to the compressed data, which is framed and output to the DDR in order.
Therefore, this embodiment compresses the data in the SPACE_LOW and SPACE_HIGH areas of the DDR synchronously, and adjacent BLOCKs constructed by one YUV2BLOCK module (such as BLOCK0 and BLOCK2) overlap in compression time. This greatly accelerates JPEG video compression in the baseboard management control chip, reduces the frame-loss rate, shortens the time data is buffered on-chip, reduces the on-chip resources occupied by the video compression function, and improves the overall performance of the chip.
A video processing apparatus provided by an embodiment of the present application is described below; the video processing apparatus described below and the video processing method described above may be cross-referenced.
Referring to fig. 4, an embodiment of the present application discloses a video processing apparatus, including:
an acquisition module 401, configured to acquire a video frame to be compressed;
the storage module 402 is configured to divide all columns in the video frame into a plurality of groups according to a preset sampling mode, and alternately store each group into a first memory area and a second memory area;
a data block construction module 403, configured to construct, from a first address of the first memory area, data stored in a target address field in the first memory area as a first data block according to a preset sampling mode; simultaneously, starting from the first address of the second memory area, constructing data stored in an object address field in the second memory area into a second data block according to a preset sampling mode;
And the compression module 404 is configured to perform a compression operation on the first data block and the second data block simultaneously.
In one embodiment, the compression module includes:
a first compression unit for performing DCT transformation in a compression operation on the first data block and the second data block at the same time;
and the second compression unit is used for constructing the data stored in the next address segment in the first memory area into a new first data block according to a preset sampling mode when the first data block and the second data block start DCT conversion, and constructing the data stored in the next address segment in the second memory area into a new second data block according to the preset sampling mode at the same time so as to simultaneously perform compression operation on the new first data block and the new second data block.
In one embodiment, the preset sampling mode is the YUV422 mode or the YUV420 mode, and the storage module is specifically configured to divide all columns in the video frame into a plurality of groups of 16 columns each according to the YUV422 mode or the YUV420 mode.
In one embodiment, the preset sampling mode is the YUV444 mode, and the storage module is specifically configured to divide all columns in the video frame into a plurality of groups of 8 columns each according to the YUV444 mode.
In one embodiment, the storage module is specifically configured to:
arranging each group according to the column sequence numbers of all columns in the video frame to obtain a group sequence; storing groups with odd arrangement positions in the group sequence into a first memory area, and storing groups with even arrangement positions in the group sequence into a second memory area; or storing the groups with even arrangement positions in the group sequence into the first memory area, and storing the groups with odd arrangement positions in the group sequence into the second memory area.
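The grouping and alternating-storage rule above can be sketched as follows; `split_columns` is a hypothetical helper written for illustration, not part of the patent's hardware design.

```python
def split_columns(num_cols, group_width):
    # Cut the frame's columns into fixed-width groups, then send groups at
    # odd positions in the sequence to one memory area and groups at even
    # positions to the other (the roles may also be swapped, as the
    # embodiment notes).
    groups = [list(range(c, min(c + group_width, num_cols)))
              for c in range(0, num_cols, group_width)]
    area_a = groups[0::2]   # positions 1, 3, 5, ... of the group sequence
    area_b = groups[1::2]   # positions 2, 4, 6, ... of the group sequence
    return area_a, area_b

# YUV422/YUV420 use 16-column groups; a 64-column frame yields four groups,
# two per memory area.
a, b = split_columns(64, 16)
assert len(a) == 2 and len(b) == 2
```

Because the two memory areas then hold interleaved halves of the frame, the two conversion modules can each read their own area independently, which is what enables the simultaneous block construction described next.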
In one embodiment, the preset sampling mode is the YUV422 mode or the YUV420 mode, and the data block construction module is specifically configured to read the Y component, the U component and the V component of each pixel stored in the target address segment into corresponding cache queues according to the YUV422 mode or the YUV420 mode, and to construct the first data block based on the cache queues.
In one embodiment, the preset sampling mode is the YUV422 mode or the YUV420 mode, and the data block construction module is specifically configured to read the Y component, the U component and the V component of each pixel stored in the object address field into corresponding cache queues according to the YUV422 mode or the YUV420 mode, and to construct the second data block based on the cache queues.
In one embodiment, the preset sampling mode is the YUV444 mode, and the data block construction module is specifically configured to read the Y component, the U component and the V component of each pixel stored in the target address segment into corresponding cache queues according to the YUV444 mode, and to construct the first data block based on the cache queues.
In one embodiment, the preset sampling mode is the YUV444 mode, and the data block construction module is specifically configured to read the Y component, the U component and the V component of each pixel stored in the object address field into corresponding cache queues according to the YUV444 mode, and to construct the second data block based on the cache queues.
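The queue-based block construction described in these embodiments can be modeled roughly as below. The geometry is simplified for illustration (eight queues feeding one 8x8 Y block); the actual design uses 32 cache queues per component, as detailed in the claims.

```python
from collections import deque

def build_y_block(y_fifos, first_queue=0):
    # Each FIFO holds one column of Y samples; assemble an 8x8 block by
    # popping one sample per row from eight adjacent column queues.
    return [[y_fifos[first_queue + col].popleft() for col in range(8)]
            for _ in range(8)]

# Fill eight queues with a tiny synthetic column pattern: queue c holds
# the values c, c+1, ..., c+7.
fifos = [deque(range(c, c + 8)) for c in range(8)]
blk = build_y_block(fifos)
assert len(blk) == 8 and len(blk[0]) == 8
```

Because popping from one bank of queues (e.g. the `_A` queues) does not disturb the other bank (the `_B` queues), two such blocks can be assembled back to back from one memory area, which is the arbitration behavior described earlier for BLOCK0 and BLOCK2.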
In one embodiment, the apparatus further includes:
a buffer multiplexing module, configured to, after the compression operation is performed on the first data block and the second data block simultaneously, store the Y components, U components and V components of subsequent new data blocks in the address spaces of the corresponding cache queues previously occupied by the Y components, U components and V components of the first data block and the second data block.
In one embodiment, the apparatus further includes:
a format conversion module, configured to convert the RGB data stored in the first memory area and the second memory area into YUV format when the video frame is in RGB format, before the data stored in the target address segment in the first memory area is constructed into the first data block and the data stored in the object address field in the second memory area is simultaneously constructed into the second data block according to the preset sampling mode.
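The format conversion step can be sketched with the standard BT.601 full-range RGB-to-YUV equations. The patent does not specify which coefficient set the chip uses, so the constants below are an assumption:

```python
def rgb_to_yuv(r, g, b):
    # BT.601 full-range conversion; chroma components are offset by 128
    # so that neutral gray maps to U = V = 128.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.5 * b + 128
    v = 0.5 * r - 0.419 * g - 0.081 * b + 128
    return round(y), round(u), round(v)

# Pure gray carries no chroma: Y equals the gray level, U = V = 128.
assert rgb_to_yuv(128, 128, 128) == (128, 128, 128)
```

Converting in place before block construction means the downstream FIFO and compression pipeline handles only YUV data, regardless of the input format.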
In one embodiment, the apparatus further includes:
a compressed data storage module, configured to, after the first data block and the second data block are compressed to obtain compressed data, add a frame identifier to the compressed data and write the compressed data with the frame identifier into a preset memory area.
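The frame identification step can be sketched as follows; the marker values and layout are hypothetical, chosen only to illustrate wrapping compressed data with a frame header and frame end before writing it to the preset memory area.

```python
FRAME_HEADER = b"\xff\xd8"   # assumption: JPEG SOI-style start marker
FRAME_END    = b"\xff\xd9"   # assumption: JPEG EOI-style end marker

def frame_compressed(payload: bytes, frame_id: int) -> bytes:
    # Prepend the header and a 2-byte frame identifier, append the end
    # marker, so two synchronized compression streams can be written to
    # DDR in order and separated again on readout.
    return FRAME_HEADER + frame_id.to_bytes(2, "big") + payload + FRAME_END

out = frame_compressed(b"\x01\x02\x03", frame_id=7)
assert out.startswith(FRAME_HEADER) and out.endswith(FRAME_END)
```

Since the two compression modules output data synchronously, a consumer reading the preset memory area relies on these boundaries to reassemble each frame.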
For the more specific working process of each module and unit in this embodiment, reference may be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.
Therefore, this embodiment provides a video processing apparatus that can improve video compression efficiency, save buffer space, and avoid frame loss during compression.
An electronic device provided by an embodiment of the present application is described below; the electronic device described below and the video processing method and apparatus described above may be cross-referenced.
Referring to fig. 5, an embodiment of the present application discloses an electronic device, including:
a memory 501 for storing a computer program;
a processor 502 for executing the computer program to implement the method disclosed in any of the embodiments above.
Further, an embodiment of the present application also provides a server as the electronic device. The server may specifically include: at least one processor, at least one memory, a power supply, a communication interface, an input/output interface, and a communication bus. The memory is configured to store a computer program, which is loaded and executed by the processor to implement the relevant steps of the video processing method disclosed in any of the foregoing embodiments.
In this embodiment, the power supply provides the working voltage for each hardware device on the server; the communication interface creates a data transmission channel between the server and external devices, and the communication protocol it follows may be any communication protocol applicable to the technical solution of the present application, which is not specifically limited here; the input/output interface acquires external input data or outputs data externally, and its specific interface type may be selected according to the application requirements, which is not specifically limited here.
In addition, the memory, as a carrier for storing resources, may be a read-only memory, a random access memory, a magnetic disk, an optical disc, or the like; the stored resources include an operating system, a computer program, data, and so on, and the storage mode may be transient or permanent.
The operating system manages and controls the hardware devices and computer programs on the server to realize the processor's operation on and processing of the data in the memory, and may be Windows Server, Netware, Unix, Linux, or the like. In addition to the computer program that performs the video processing method disclosed in any of the foregoing embodiments, the computer programs may include programs for other specific tasks. The data may include, in addition to data such as virtual machines, data such as developer information of the virtual machines.
Further, an embodiment of the present application also provides a terminal as the electronic device. The terminal may include, but is not limited to, a smartphone, a tablet computer, a notebook computer, a desktop computer, or the like.
Generally, the terminal in this embodiment includes: a processor and a memory.
The processor may include one or more processing cores, such as a 4-core or 8-core processor. The processor may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor may also include a main processor and a coprocessor: the main processor, also called a CPU (Central Processing Unit), processes data in the awake state, while the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor may integrate a GPU (Graphics Processing Unit) for rendering the content to be displayed on the display screen. In some embodiments, the processor may also include an AI (Artificial Intelligence) processor for handling machine learning computations.
The memory may include one or more computer-readable storage media, which may be non-transitory. The memory may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In this embodiment, the memory is at least used to store a computer program which, after being loaded and executed by the processor, can implement the relevant steps of the video processing method performed on the terminal side as disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory may also include an operating system, data, and the like, and the storage mode may be transient or permanent. The operating system may include Windows, Unix, Linux, and the like. The data may include, but is not limited to, update information of the application.
In some embodiments, the terminal may further include a display screen, an input-output interface, a communication interface, a sensor, a power supply, and a communication bus.
A readable storage medium provided by the embodiments of the present application is described below; the readable storage medium described below and the video processing method, apparatus and device described above may be cross-referenced.
A readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the video processing method disclosed in the foregoing embodiments.
The principles and embodiments of the present application are described herein with specific examples; the above examples are provided only to help understand the method of the present application and its core ideas. Meanwhile, those skilled in the art may make changes to the specific embodiments and the application scope according to the ideas of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (11)

1. A video processing method, comprising:
acquiring a video frame to be compressed;
dividing all columns in the video frame into a plurality of groups according to a preset sampling mode, and alternately storing each group into a first memory area and a second memory area;
starting from the first address of the first memory area, constructing data stored in a target address field in the first memory area into a first data block according to the preset sampling mode; simultaneously, starting from the first address of the second memory area, and constructing data stored in an object address field in the second memory area into a second data block according to the preset sampling mode;
Simultaneously compressing the first data block and the second data block;
wherein, the alternately storing each group into the first memory area and the second memory area includes:
arranging each group according to the column sequence numbers of all columns in the video frame to obtain a group sequence;
storing groups with odd arrangement positions in the group sequence into the first memory area, and storing groups with even arrangement positions in the group sequence into the second memory area;
wherein the compressing the first data block and the second data block simultaneously includes:
performing DCT transformation in compression operation on the first data block and the second data block simultaneously;
when the first data block and the second data block start DCT transformation, constructing the data stored in the next address segment in the first memory area into a new first data block according to the preset sampling mode, and constructing the data stored in the next address segment in the second memory area into a new second data block according to the preset sampling mode at the same time so as to simultaneously compress the new first data block and the new second data block;
wherein two conversion modules and two compression modules are provided; the two conversion modules correspond to the first memory area and the second memory area respectively; each conversion module corresponds to one compression module, and each compression module includes two DCT units; 32 Y-component cache queues are provided: Y_FIFO_0_A to Y_FIFO_15_A and Y_FIFO_0_B to Y_FIFO_15_B; 32 U-component cache queues: U_FIFO_0_A to U_FIFO_15_A and U_FIFO_0_B to U_FIFO_15_B; and 32 V-component cache queues: V_FIFO_0_A to V_FIFO_15_A and V_FIFO_0_B to V_FIFO_15_B;
in different sampling modes, each conversion module synchronously constructs the two data blocks using the 32 Y-component cache queues, the 32 U-component cache queues and the 32 V-component cache queues, and adjacent data blocks constructed by each conversion module overlap in compression time.
2. The method of claim 1, wherein
the preset sampling mode is as follows: YUV422 mode or YUV420 mode;
accordingly, the dividing all columns in the video frame into a plurality of groups according to a preset sampling mode includes:
and dividing all columns in the video frame into a plurality of groups with the number of columns being 16 according to the YUV422 mode or the YUV420 mode.
3. The method of claim 1, wherein
the preset sampling mode is as follows: YUV444 mode;
accordingly, the dividing all columns in the video frame into a plurality of groups according to a preset sampling mode includes:
and dividing all columns in the video frame into a plurality of groups with the number of columns being 8 according to the YUV444 mode.
4. The method of claim 1, wherein
the preset sampling mode is as follows: YUV422 mode or YUV420 mode;
Correspondingly, the constructing the data stored in the target address segment in the first memory area into the first data block according to the preset sampling mode includes:
according to the YUV422 mode or the YUV420 mode, respectively reading Y components, U components and V components of each pixel point stored in the target address field to corresponding cache queues, and constructing the first data block based on each cache queue;
correspondingly, the constructing the data stored in the object address segment in the second memory area into a second data block according to the preset sampling mode includes:
and respectively reading Y components, U components and V components of each pixel point stored in the object address segment to corresponding cache queues according to the YUV422 mode or the YUV420 mode, and constructing the second data block based on each cache queue.
5. The method of claim 1, wherein
the preset sampling mode is as follows: YUV444 mode;
correspondingly, the constructing the data stored in the target address segment in the first memory area into the first data block according to the preset sampling mode includes:
according to the YUV444 mode, respectively reading Y components, U components and V components of each pixel point stored in the target address section to corresponding cache queues, and constructing the first data block based on each cache queue;
Correspondingly, the constructing the data stored in the object address segment in the second memory area into a second data block according to the preset sampling mode includes:
and according to the YUV444 mode, reading Y components, U components and V components of each pixel point stored in the object address segment to corresponding cache queues, and constructing the second data block based on each cache queue.
6. The method of claim 4 or 5, wherein after the first data block and the second data block are compressed simultaneously, the method further comprises:
storing, in the corresponding cache queues, the Y components, the U components and the V components of subsequent new data blocks in the address spaces where the Y components, the U components and the V components of the first data block and the second data block are located.
7. The method according to any one of claims 1 to 5, wherein before the data stored in the target address segment in the first memory area is constructed into the first data block according to the preset sampling mode starting from the first address of the first memory area, and the data stored in the object address field in the second memory area is simultaneously constructed into the second data block according to the preset sampling mode starting from the first address of the second memory area, the method further comprises:
if the video frame is in RGB format, converting the RGB data stored in the first memory area and the second memory area into YUV format.
8. The method according to any one of claims 1 to 5, further comprising:
after the first data block and the second data block are compressed to obtain compressed data, adding a frame identifier to the compressed data, and writing the compressed data added with the frame identifier into a preset memory area.
9. A video processing apparatus, comprising:
the acquisition module is used for acquiring the video frames to be compressed;
the storage module is used for dividing all columns in the video frame into a plurality of groups according to a preset sampling mode, and alternately storing each group into a first memory area and a second memory area;
the data block construction module is used for constructing data stored in a target address section in the first memory area into a first data block according to the preset sampling mode from the first address of the first memory area; simultaneously, starting from the first address of the second memory area, and constructing data stored in an object address field in the second memory area into a second data block according to the preset sampling mode;
The compression module is used for simultaneously compressing the first data block and the second data block;
the storage module is specifically configured to:
arranging each group according to the column sequence numbers of all columns in the video frame to obtain a group sequence;
storing groups with odd arrangement positions in the group sequence into the first memory area, and storing groups with even arrangement positions in the group sequence into the second memory area;
wherein the compression module comprises:
a first compression unit for simultaneously performing DCT transformation in a compression operation on the first data block and the second data block;
the second compression unit is used for constructing the data stored in the next address section in the first memory area into a new first data block according to the preset sampling mode when the first data block and the second data block start DCT conversion, and constructing the data stored in the next address section in the second memory area into a new second data block according to the preset sampling mode at the same time so as to simultaneously compress the new first data block and the new second data block;
wherein two conversion modules and two compression modules are provided; the two conversion modules correspond to the first memory area and the second memory area respectively; each conversion module corresponds to one compression module, and each compression module includes two DCT units; 32 Y-component cache queues are provided: Y_FIFO_0_A to Y_FIFO_15_A and Y_FIFO_0_B to Y_FIFO_15_B; 32 U-component cache queues: U_FIFO_0_A to U_FIFO_15_A and U_FIFO_0_B to U_FIFO_15_B; and 32 V-component cache queues: V_FIFO_0_A to V_FIFO_15_A and V_FIFO_0_B to V_FIFO_15_B;
in different sampling modes, each conversion module synchronously constructs the two data blocks using the 32 Y-component cache queues, the 32 U-component cache queues and the 32 V-component cache queues, and adjacent data blocks constructed by each conversion module overlap in compression time.
10. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the method of any one of claims 1 to 8.
11. A readable storage medium for storing a computer program, wherein the computer program when executed by a processor implements the method of any one of claims 1 to 8.
CN202211437518.2A 2022-11-17 2022-11-17 Video processing method, device, equipment and readable storage medium Active CN115499667B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211437518.2A CN115499667B (en) 2022-11-17 2022-11-17 Video processing method, device, equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211437518.2A CN115499667B (en) 2022-11-17 2022-11-17 Video processing method, device, equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN115499667A CN115499667A (en) 2022-12-20
CN115499667B true CN115499667B (en) 2023-07-14

Family

ID=85115943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211437518.2A Active CN115499667B (en) 2022-11-17 2022-11-17 Video processing method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN115499667B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101188761A (en) * 2007-11-30 2008-05-28 上海广电(集团)有限公司中央研究院 Method for optimizing DCT quick algorithm based on parallel processing in AVS
CN115086668A (en) * 2022-07-21 2022-09-20 苏州浪潮智能科技有限公司 Video compression method, system, equipment and computer readable storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1126411A (en) * 1995-01-06 1996-07-10 大宇电子株式会社 Apparatus for parallel encoding/decoding of digital video signals
US9749661B2 (en) * 2012-01-18 2017-08-29 Qualcomm Incorporated Sub-streams for wavefront parallel processing in video coding
CN104349168A (en) * 2014-08-11 2015-02-11 大连戴姆科技有限公司 Ultra-high-speed image real-time compression method
CN113709489B (en) * 2021-07-26 2024-04-19 山东云海国创云计算装备产业创新中心有限公司 Video compression method, device, equipment and readable storage medium
CN114501024B (en) * 2022-04-02 2022-07-19 苏州浪潮智能科技有限公司 Video compression system, method, computer readable storage medium and server
CN115243047A (en) * 2022-07-22 2022-10-25 山东云海国创云计算装备产业创新中心有限公司 Video compression method, device, equipment and medium

Also Published As

Publication number Publication date
CN115499667A (en) 2022-12-20

Similar Documents

Publication Publication Date Title
US11593594B2 (en) Data processing method and apparatus for convolutional neural network
US11269529B2 (en) Neural network data processing apparatus, method and electronic device
CN111562948B (en) System and method for realizing parallelization of serial tasks in real-time image processing system
CN112235579B (en) Video processing method, computer-readable storage medium and electronic device
CN113709489B (en) Video compression method, device, equipment and readable storage medium
CN114501024B (en) Video compression system, method, computer readable storage medium and server
CN115460414B (en) Video compression method and system of baseboard management control chip and related components
CN112188280B (en) Image processing method, device and system and computer readable medium
WO2024074012A1 (en) Video transmission control method, apparatus and device, and nonvolatile readable storage medium
CN115209145A (en) Video compression method, system, device and readable storage medium
US20200128264A1 (en) Image processing
CN113286174B (en) Video frame extraction method and device, electronic equipment and computer readable storage medium
CN113573072B (en) Image processing method and device and related components
CN114428595A (en) Image processing method, image processing device, computer equipment and storage medium
CN115499667B (en) Video processing method, device, equipment and readable storage medium
CN110413540B (en) Method, system, equipment and storage medium for FPGA data caching
CN116166185A (en) Caching method, image transmission method, electronic device and storage medium
CN101499245B (en) Asynchronous first-in first-out memory, liquid crystal display controller and its control method
WO2021237513A1 (en) Data compression storage system and method, processor, and computer storage medium
CN107241601B (en) Image data transmission method, device and terminal
CN116795442B (en) Register configuration method, DMA controller and graphics processing system
CN113126869B (en) Method and system for realizing KVM image high-speed redirection based on domestic BMC chip
CN114554126B (en) Baseboard management control chip, video data transmission method and server
WO2024037251A1 (en) Data transmission method, apparatus and system, device, and storage medium
CN117979059A (en) Video processing method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant