CN112437303A - JPEG decoding method and device - Google Patents

Publication number
CN112437303A
Authority
CN
China
Prior art keywords
pictures
decoded data
data
reading
ddr
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011263958.1A
Other languages
Chinese (zh)
Inventor
张云哲
耿嘉
樊平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shenwei Technology Co ltd
Original Assignee
Beijing Shenwei Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenwei Technology Co ltd filed Critical Beijing Shenwei Technology Co ltd
Priority to CN202011263958.1A priority Critical patent/CN112437303A/en
Publication of CN112437303A publication Critical patent/CN112437303A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H04N19/174 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/15 Data rate or code amount at the encoder output by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
    • H04N19/436 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/625 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]

Abstract

The invention discloses a JPEG decoding method and device, applied to the field of image processing. A Field Programmable Gate Array (FPGA) accelerator card reads M pictures from a Double Data Rate (DDR) memory of the FPGA accelerator card, wherein M is an integer greater than 1; the FPGA accelerator card correspondingly distributes the M pictures read from the DDR to M JPEG decoders, and the M JPEG decoders share the same GMEM resource on the FPGA accelerator card to carry out M-path parallel JPEG decoding on the M pictures to obtain M decoded data streams; the FPGA accelerator card reads and merges the M decoded data streams to obtain merged decoded data, and outputs the merged decoded data to the DDR. The invention improves JPEG decoding efficiency.

Description

JPEG decoding method and device
Technical Field
The invention belongs to the field of image processing, and particularly relates to a JPEG decoding method and device.
Background
Joint Photographic Experts Group (JPEG) is a lossy compression standard widely used for photographic images. JPEG itself only describes how to convert a picture into a stream of bytes. An additional standard, JFIF (JPEG File Interchange Format), created by C-Cube Microsystems and others, specifies how to produce a file suitable for storage and transmission by a computer from a JPEG stream.
As the capacity of accelerator cards increases, when JPEG decoding is performed on an FPGA (Field Programmable Gate Array), more kernels may be placed on the FPGA accelerator card in order to maximize JPEG decoding throughput. However, each kernel occupies a certain amount of GMEM (cache) resources, and the GMEM resources on each accelerator card are limited, which limits the number of kernels that can be placed and therefore the JPEG decoding efficiency of the FPGA.
Disclosure of Invention
In view of the above problems in the prior art, embodiments of the present invention provide a JPEG decoding method and apparatus, so as to improve JPEG decoding efficiency.
In a first aspect, an embodiment of the present invention provides a JPEG decoding method, which is applied to an FPGA accelerator card, and the method includes:
reading M pictures from a DDR of the FPGA accelerator card, wherein M is an integer larger than 1;
correspondingly distributing the M pictures read from the DDR to M JPEG decoders, wherein the M JPEG decoders share the same GMEM resource on the FPGA accelerator card to carry out M-path parallel JPEG decoding on the M pictures to obtain M-path decoded data streams;
reading and converging the M paths of decoding data streams to obtain converged decoding data;
and outputting the merged decoded data to the DDR.
Optionally, the reading of the data fragments of the M pictures from the DDR of the FPGA accelerator card includes:
and sequentially reading the data segments of the M pictures from the DDR based on a preset reading sequence and the size of the data segments until the data of the M pictures are completely read.
Optionally, the reading the data segments of the M pictures from the DDR in sequence based on a preset reading sequence and a data segment size includes:
judging a first overall state aiming at the M pictures, wherein the first overall state represents whether the M pictures are read completely;
if the first overall state is negative, polling the M pictures to check whether each picture still has unread data, and if the currently checked picture has unread data, reading in the next data segment from the currently checked picture and updating the data state of the currently checked picture;
and updating the first overall state according to the data state of each picture in the M pictures, and returning to the step of judging the first overall state aiming at the M pictures.
Optionally, the reading and merging the M decoded data streams to obtain merged decoded data includes:
reading the decoded data segments from the M decoded data streams in sequence based on a preset reading sequence;
and converging the decoded data fragments read from the M paths of decoded data streams to obtain converged decoded data.
Optionally, the reading out the decoded data segment from the M decoded data streams based on a preset reading order includes:
judging a second overall state of the M paths of decoded data streams, wherein the second overall state represents whether the M paths of decoded data streams are all empty;
if not, judging whether the i-th decoded data stream of the M decoded data streams is empty, and if it is not empty, reading out the next decoded data segment from the i-th decoded data stream, where i takes the values 1 to M in turn;
and updating the second overall state according to the state of each decoded data stream in the M decoded data streams, and returning to the step of judging the second overall state of the M decoded data streams.
In a second aspect, an embodiment of the present invention provides a JPEG decoding apparatus, which is applied to an FPGA accelerator card, and the apparatus includes:
the input unit is used for reading M pictures from a DDR of the FPGA accelerator card, wherein M is an integer larger than 1;
the decoding unit is used for correspondingly distributing the M pictures read from the DDR to M JPEG decoders, and the M JPEG decoders share the same GMEM resource on the FPGA accelerator card to carry out M-path parallel JPEG decoding on the M pictures to obtain M-path decoded data streams;
and the output unit is used for reading and converging the M paths of decoding data streams to obtain converged decoding data and outputting the converged decoding data to the DDR.
Optionally, the input unit is specifically configured to:
and sequentially reading the data segments of the M pictures from the DDR based on a preset reading sequence and the size of the data segments until the data of the M pictures are completely read.
Optionally, the input unit includes:
a first judging subunit, configured to judge a first overall state for the M pictures, where the first overall state represents whether all the M pictures have been read;
the polling subunit is used for, if the first overall state is negative, polling the M pictures to check whether each picture still has unread data, and if the currently checked picture has unread data, reading in the next data segment from the currently checked picture and updating the data state of the currently checked picture;
and the first state updating subunit is used for updating the first overall state according to the data state of each of the M pictures and returning to the step of judging the first overall state of the M pictures.
Optionally, the output unit includes:
a reading subunit, configured to sequentially read the decoded data segments from the M decoded data streams based on a preset reading order;
and the merging subunit is used for merging the decoded data fragments read from the M paths of decoded data streams to obtain merged decoded data.
Optionally, the readout subunit includes:
a second judging subunit, configured to judge a second overall state of the M decoded data streams, where the second overall state represents whether the M decoded data streams are all empty;
a decoded data reading subunit, configured to, if the second overall state indicates that the M decoded data streams are not all empty, judge whether the i-th decoded data stream of the M decoded data streams is empty, and if it is not empty, read the next decoded data segment from the i-th decoded data stream, where i takes the values 1 to M in turn;
and a second state updating subunit, configured to update the second overall state according to the state of each decoded data stream in the M decoded data streams, and return to the step of determining the second overall state of the M decoded data streams.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a main processor, an FPGA accelerator card, a DDR cache on the FPGA accelerator card, and a computer program stored on the memory and operable on the FPGA accelerator card, where the DDR cache acquires M pictures from the main processor, where M is an integer greater than 1; and when the FPGA accelerator card executes the program, the steps of any one of the methods in the first aspect are realized for the M pictures.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of any of the methods of the first aspect.
One or more technical solutions provided by the embodiments of the present invention at least achieve the following technical effects or advantages:
according to the JPEG decoding method and device provided by the embodiment of the invention, the FPGA accelerator card correspondingly distributes M pictures read from a DDR to M JPEG decoders, the M JPEG decoders share the same GMEM resource on the FPGA accelerator card to carry out M-path parallel JPEG decoding on the M pictures, and M-path decoding data streams are correspondingly obtained; reading and converging M paths of decoding data streams to obtain converged decoding data; and outputting the merged decoded data to the DDR. Therefore, the multi-channel data share the same GMEM resource, multi-channel parallel of JPEG file level is realized, and the JPEG decoding efficiency on the FPGA accelerator card is further improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 shows a flow chart of a JPEG decoding method in an embodiment of the invention;
FIG. 2 is a data flow diagram illustrating a JPEG decoding method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram showing the structure of a JPEG decoding apparatus in the embodiment of the present invention.
Detailed Description
In order to solve the problem that the prior art limits the JPEG decoding efficiency of an FPGA, the embodiment of the invention provides a JPEG decoding method and a JPEG decoding device, and the general idea is as follows:
the FPGA accelerator card performs M-path parallel JPEG decoding on M pictures read from a DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory) through M JPEG decoders sharing the same GMEM resource on the FPGA accelerator card, and M-path decoded Data streams are correspondingly obtained; and merging the M paths of decoded data streams and outputting the decoded data streams to the DDR. Therefore, the multi-channel data share the same GMEM resource, multi-channel parallelism of JPEG file level is realized, and the JPEG decoding efficiency of the FPGA is further improved.
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The term "and/or" appearing herein merely describes an association between associated objects and means that three relationships may exist; e.g., A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship; the word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names.
First embodiment
The embodiment of the invention provides a JPEG decoding method which is applied to an FPGA accelerator card and realizes high-efficiency JPEG decoding based on the FPGA accelerator card.
Referring to fig. 1 and fig. 2, a JPEG decoding method according to an embodiment of the present invention is described in detail below:
first, step S101 is performed: reading M pictures from a DDR of the FPGA accelerator card, wherein M is an integer larger than 1.
In a specific implementation, step S101 may read in each picture whole at one time and perform step S102 only after all M pictures have been read. Alternatively, in order to start decoding the M pictures as early as possible and improve decoding efficiency, step S101 may, instead of reading each picture whole at one time, read the data segments of the M pictures from the DDR in sequence based on a preset reading sequence and a fixed data segment size until all data of the M pictures have been read. For each of the M pictures, JPEG decoding of the picture is triggered as soon as a data segment of the picture has been read in from the DDR; decoding does not have to wait until the whole picture has been read in, which improves decoding efficiency.
Specifically, the data segments of the M pictures are read from the DDR in sequence based on the preset reading sequence as follows: first, the first data segment of each of the M pictures is read in turn; then, the second data segment of each of the M pictures is read in turn; then, the third data segment of each of the M pictures is read in turn; and so on, until the last data segment of each of the M pictures has been read, at which point all data of the M pictures have been read and the process of reading data segments from the DDR ends.
For example, the detailed reading sequence may be: read in data segment 1 of the 1st picture, data segment 1 of the 2nd picture, data segment 1 of the 3rd picture, and so on through data segment 1 of the M-th picture; then read in data segment 2 of the 1st picture, data segment 2 of the 2nd picture, data segment 2 of the 3rd picture, and so on; the process continues in this pattern until the last data segment of the M-th picture has been read, at which point the process of reading data segments from the DDR ends.
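The reading order described above can be illustrated with a minimal Python sketch. It is only an illustration, not the patented implementation: the function names `split_segments` and `interleaved_read_order` are hypothetical, pictures are modelled as plain byte strings rather than DDR regions, and indices are 0-based whereas the text counts from 1.

```python
from itertools import zip_longest


def split_segments(data: bytes, segment_size: int) -> list[bytes]:
    """Cut one picture's bytes into fixed-size segments (the last may be shorter)."""
    return [data[i:i + segment_size] for i in range(0, len(data), segment_size)]


def interleaved_read_order(pictures: list[bytes], segment_size: int):
    """Yield (picture_index, segment_index, segment) in round-robin order:
    segment 0 of every picture, then segment 1 of every picture, and so on."""
    per_picture = [split_segments(p, segment_size) for p in pictures]
    for seg_idx, row in enumerate(zip_longest(*per_picture)):
        for pic_idx, segment in enumerate(row):
            if segment is not None:  # None marks a picture that is already exhausted
                yield pic_idx, seg_idx, segment


if __name__ == "__main__":
    for pic, seg, chunk in interleaved_read_order([b"AAAAAA", b"BBB", b"CCCC"], 2):
        print(f"picture {pic}, segment {seg}: {chunk!r}")
```

Pictures of different lengths simply drop out of the rotation once their last segment has been read, which matches the rule that reading ends only when every picture is exhausted.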
Since the multiple decoders share the GMEM resource, in order to ensure the reliability of the process of reading the data segments from the DDR, the step of sequentially reading the data segments of the M pictures from the DDR based on the preset reading order and the size of the data segments specifically includes the following steps S1011 to S1013:
step S1011, determining a first overall state for the M pictures, where the first overall state represents whether all the M pictures have been read.
Step S1012, if the first overall state is no, polling to check whether data exists in the data of each of the M pictures, and if data exists in the currently checked picture, reading in the next data segment from the currently checked picture, and updating the data state of the currently checked picture.
Step S1013, after polling M pictures, updating the first overall state according to the data state of each of the M pictures, and returning to perform step S1011.
The first overall state represents whether all M pictures have been read. Specifically, it is calculated from the data state of each of the M pictures, where the data state of a picture represents whether that picture has been completely read.
In order to fully understand the process of reading M pictures from the DDR, taking picture a, picture B, and picture C as an example, the process of reading M pictures from the DDR is illustrated with reference to fig. 2 and the following steps 1 to 5:
Step 1: acquiring a first overall state aiming at the picture A, the picture B and the picture C, judging whether the picture A, the picture B and the picture C all have no remaining data according to the first overall state, and if so, ending the flow of reading in data segments from the DDR; otherwise, executing step 2.
Step 2: reading the data state of the picture A, and judging whether the data in the picture A is completely read according to the data state of the picture A; if yes, directly entering step 3; otherwise, reading the next data segment from the picture A, and executing step 3 after updating the data state of the picture A.
Step 3: reading the data state of the picture B, and judging whether the data in the picture B is completely read according to the data state of the picture B; if yes, directly entering step 4; otherwise, reading the next data segment from the picture B, and executing step 4 after updating the data state of the picture B.
Step 4: reading the data state of the picture C, and judging whether the data in the picture C is completely read according to the data state of the picture C; if yes, directly entering step 5; otherwise, reading the next data segment from the picture C, and executing step 5 after updating the data state of the picture C.
Step 5: calculating the first overall state according to the current data state of each of the pictures A, B and C, and returning to step 1.
Steps 1 to 5 are repeated until all the data of picture A, picture B and picture C have been read, at which point the process of reading data segments from the DDR ends. In this way data segments are read from each picture continuously and in turn, which satisfies the input data demand of each JPEG decoder during decoding and keeps the multi-path parallel JPEG decoding running smoothly.
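The polling loop of steps S1011 to S1013 (steps 1 to 5 in the example) can be sketched in Python as follows. This is a sketch under stated assumptions rather than the actual FPGA logic: `PictureSource`, `read_all`, and the per-decoder input lists are hypothetical names introduced for illustration, and byte strings in host memory stand in for pictures held in the DDR.

```python
from dataclasses import dataclass


@dataclass
class PictureSource:
    """One picture plus its data state (how many bytes were already read in)."""
    data: bytes
    offset: int = 0

    @property
    def exhausted(self) -> bool:
        # Per-picture data state: True once the picture has been completely read.
        return self.offset >= len(self.data)

    def read_next(self, segment_size: int) -> bytes:
        segment = self.data[self.offset:self.offset + segment_size]
        self.offset += len(segment)  # update the data state of this picture
        return segment


def read_all(pictures: list[bytes], segment_size: int) -> list[list[bytes]]:
    """Poll the pictures in turn and hand each segment to that picture's decoder input."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    sources = [PictureSource(p) for p in pictures]
    decoder_inputs: list[list[bytes]] = [[] for _ in sources]
    all_read = all(s.exhausted for s in sources)      # first overall state
    while not all_read:                               # S1011: judge the overall state
        for i, src in enumerate(sources):             # S1012: poll every picture
            if not src.exhausted:
                decoder_inputs[i].append(src.read_next(segment_size))
        all_read = all(s.exhausted for s in sources)  # S1013: update the overall state
    return decoder_inputs
```

Checking the per-picture state before every read is what lets pictures of different sizes share one loop: a picture that finishes early is skipped on later rounds instead of stalling the others.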
S102: and correspondingly distributing the M pictures read from the DDR to M JPEG decoders, wherein the M JPEG decoders share the same GMEM resource on the FPGA accelerator card to carry out M-path parallel JPEG decoding on the M pictures to obtain M-path decoded data streams.
Specifically, if step S101 does not read in each picture whole at one time, the corresponding JPEG decoder is triggered to start JPEG decoding of a picture as soon as the first data segment of that picture is read in. It should be noted that M-path JPEG decoding means that the M pictures are JPEG-decoded at the same time, and each path of JPEG decoding specifically includes: decompressing the current data segment read from the corresponding picture; then inversely quantizing the decompression result; and finally performing an inverse discrete cosine transform on the inverse quantization result to obtain the decoded data segment corresponding to the data segment. The next data segment is then decoded in the same way, so that a decoded data stream corresponding to the picture is formed.
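The last two stages named above, inverse quantization and the inverse discrete cosine transform, can be sketched for a single block as follows. The sketch assumes the 8x8 block size and the IDCT formula of baseline JPEG; the source does not say how the decoders implement these stages, so the function names `dequantize`, `idct_8x8`, and `decode_block` are hypothetical. The decompression (entropy decoding) stage and the final level shift and clamping are omitted, and the direct four-loop IDCT is chosen for readability, not speed.

```python
import math

N = 8  # baseline JPEG operates on 8x8 blocks (assumption; not stated in the source)


def dequantize(coeffs, qtable):
    """Inverse quantization: multiply each coefficient by its quantization step."""
    return [[coeffs[u][v] * qtable[u][v] for v in range(N)] for u in range(N)]


def idct_8x8(block):
    """Direct (unoptimized) 2-D inverse DCT of one 8x8 coefficient block."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (c(u) * c(v) * block[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[x][y] = total / 4
    return out


def decode_block(coeffs, qtable):
    """Inverse quantization followed by inverse DCT for one block."""
    return idct_8x8(dequantize(coeffs, qtable))
```

A block whose only non-zero coefficient is the DC term decodes to a flat block: the DC value after dequantization divided by 8.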
S103, reading and converging the M paths of decoded data streams to obtain converged decoded data.
In a specific embodiment, step S103 specifically includes: reading the decoded data segments from the M paths of decoded data streams in sequence based on a preset reading sequence; and converging the decoded data fragments read from the M paths of decoded data streams to obtain converged decoded data.
In a specific implementation, to keep the flow of reading out decoded data segments running smoothly, each decoded data stream is checked before data is read from it: it is first judged whether the decoded data stream is empty, and a decoded data segment is read from it only if it is not empty. This prevents the flow from blocking on a decoded data stream that currently holds no data.
Specifically, reading out the decoded data fragments from the M decoded data streams is performed by: judging a second overall state of the M paths of decoded data streams, wherein the second overall state represents whether the M paths of decoded data streams are all empty; if the second overall state represents that the M decoded data streams are all empty, the process of reading the decoded data segments from the M decoded data streams is finished, if the second overall state represents that the M decoded data streams are not all empty, whether the ith decoded data stream in the M decoded data streams is empty is judged, if the ith decoded data stream is not empty, the next decoded data segment is read from the ith decoded data stream, and i is sequentially 1 to M; and updating the second overall state according to the state of each decoded data stream in the M decoded data streams, and returning to the step of judging the second overall state of the M decoded data streams.
Specifically, the second overall state is calculated according to the state of each decoded data stream in the M decoded data streams, and the state of each decoded data stream represents whether the decoded data stream is empty or not.
In order to fully understand the process of reading out the decoded data segments from the M decoded data streams, taking only picture a, picture B, and picture C as an example, the process of reading out the decoded data segments from the M decoded data streams is exemplified with reference to fig. 2 and the following steps 11 to 15:
Step 11: acquiring a second overall state aiming at the M decoded data streams, judging whether the M decoded data streams are all empty according to the second overall state, and if so, ending the process of reading out the decoded data segments from the M decoded data streams; otherwise, executing step 12.
Step 12: reading the state of the decoded data stream1 corresponding to the picture A, judging whether the state of stream1 is empty, and if yes, directly entering step 13; otherwise, reading the next decoded data segment from stream1 and executing step 13 after updating the state of stream1.
Step 13: reading the state of the decoded data stream2 corresponding to the picture B, judging whether the state of stream2 is empty, and if yes, directly entering step 14; otherwise, reading the next decoded data segment from stream2 and executing step 14 after updating the state of stream2.
Step 14: reading the state of the decoded data stream3 corresponding to the picture C, judging whether the state of stream3 is empty, and if yes, directly entering step 15; otherwise, reading the next decoded data segment from stream3 and executing step 15 after updating the state of stream3.
Step 15: calculating the second overall state according to whether each of the 3 decoded data streams stream1 to stream3 is empty, and returning to step 11.
By repeating steps 11 to 15, decoded data segments can be read continuously from each decoded data stream, and an empty decoded data stream is prevented from blocking the whole reading process.
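The read-out and merge loop of steps 11 to 15 can be sketched in Python in the same spirit. The sketch makes several assumptions that the source does not state: each decoded data stream is modelled as a `collections.deque` that is already filled, the merged decoded data is a flat list in read-out order, and `merge_streams` is a hypothetical name. On real hardware a stream may be only temporarily empty while its decoder is still running, which this simplified model does not capture.

```python
from collections import deque


def merge_streams(streams: list[deque]) -> list:
    """Read decoded segments from the streams in round-robin order and merge them."""
    merged = []
    all_empty = all(len(s) == 0 for s in streams)      # second overall state
    while not all_empty:                               # step 11: judge the overall state
        for stream in streams:                         # steps 12 to 14: visit each stream
            if stream:                                 # never read from an empty stream
                merged.append(stream.popleft())
        all_empty = all(len(s) == 0 for s in streams)  # step 15: update the overall state
    return merged


if __name__ == "__main__":
    stream1 = deque(["A1", "A2", "A3"])
    stream2 = deque(["B1"])
    stream3 = deque(["C1", "C2"])
    print(merge_streams([stream1, stream2, stream3]))
```

The emptiness check before each `popleft()` plays the role of the non-blocking test in the text: a stream with nothing to deliver is skipped for that round rather than holding up the other streams.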
And S104, outputting the merged decoded data to the DDR.
In a second aspect, based on the same inventive concept, an embodiment of the present invention provides a JPEG decoding apparatus applied to an FPGA accelerator card, and referring to fig. 3, the JPEG decoding apparatus includes:
the input unit 301 is configured to read M pictures from a DDR of the FPGA accelerator card, where M is an integer greater than 1;
the decoding unit 302 is configured to correspondingly allocate the M pictures read from the DDR to M JPEG decoders, where the M JPEG decoders share the same GMEM resource on the FPGA accelerator card to perform M-way parallel JPEG decoding on the M pictures, so as to obtain M-way decoded data streams;
and the output unit 303 is configured to read and merge the M decoded data streams to obtain merged decoded data, and output the merged decoded data to the DDR.
In an optional implementation, the input unit 301 is specifically configured to:
sequentially read data segments of the M pictures from the DDR, based on a preset reading order and a data segment size, until all data of the M pictures has been read.
In an alternative embodiment, the input unit 301 includes:
a first judging subunit, configured to judge a first overall state of the M pictures, where the first overall state indicates whether all of the M pictures have been completely read;
a polling subunit, configured to, if the first overall state is negative, poll each of the M pictures to check whether it still has data to be read and, if the picture currently being checked has data, read in the next data segment from that picture and update its data state; and
a first state updating subunit, configured to update the first overall state according to the data state of each of the M pictures, and to return to the step of judging the first overall state of the M pictures.
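The polling read-in performed by these subunits can be sketched in Python as follows. This is a hedged illustration only: the pictures are modelled as in-memory `bytes` objects rather than DDR regions, the "data state" of a picture is assumed to be a read offset, and the name `read_pictures` and its parameters are hypothetical.

```python
def read_pictures(pictures, segment_size):
    """Poll the pictures in a fixed order, reading one segment per visit.

    Returns a list of (picture index, segment) pairs in read order.
    """
    offsets = [0] * len(pictures)      # data state: bytes already read per picture
    segments = []
    # First overall state: True when every picture has been completely read.
    all_read = all(off >= len(p) for off, p in zip(offsets, pictures))
    while not all_read:
        for i, data in enumerate(pictures):
            if offsets[i] < len(data):                 # picture still has data
                seg = data[offsets[i]:offsets[i] + segment_size]
                segments.append((i, seg))
                offsets[i] += len(seg)                 # update its data state
        all_read = all(off >= len(p) for off, p in zip(offsets, pictures))
    return segments

pics = [b"AAAAAA", b"BB", b"CCCC"]
print(read_pictures(pics, 4))
# prints [(0, b'AAAA'), (1, b'BB'), (2, b'CCCC'), (0, b'AA')]
```

Shorter pictures finish early and are skipped in later polling rounds, so the longest picture determines how many rounds are needed.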
In an alternative embodiment, the output unit 303 includes:
a readout subunit, configured to sequentially read out decoded data segments from the M decoded data streams based on a preset readout order; and
a merging subunit, configured to merge the decoded data segments read out from the M decoded data streams to obtain the merged decoded data.
In an alternative embodiment, the readout subunit includes:
a second judging subunit, configured to judge a second overall state of the M decoded data streams, where the second overall state indicates whether the M decoded data streams are all empty;
a decoded data reading subunit, configured to, if the second overall state is negative, judge whether the i-th of the M decoded data streams is empty and, if it is not empty, read out the next decoded data segment from the i-th decoded data stream, where i takes the values 1 to M in sequence; and
a second state updating subunit, configured to update the second overall state according to the state of each of the M decoded data streams, and to return to the step of judging the second overall state of the M decoded data streams.
The implementation of each functional unit of the above apparatus has been described in detail in the foregoing embodiments of the JPEG decoding method; for brevity, refer to the description of the method embodiments.
In a third aspect, based on the same inventive concept as the foregoing JPEG decoding method embodiments, the present invention further provides an electronic device, including a memory, a main processor, an FPGA accelerator card, a DDR cache on the FPGA accelerator card, and a computer program stored in the memory and executable on the FPGA accelerator card, where the DDR cache acquires M pictures from the main processor, M being an integer greater than 1; when the FPGA accelerator card executes the program, the steps of any implementation of the foregoing JPEG decoding method embodiments are performed on the M pictures.
In a fourth aspect, based on the same inventive concept as the foregoing JPEG decoding method embodiments, the present invention further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of any of the foregoing JPEG decoding method embodiments.
According to the JPEG decoding method and device provided by the embodiments of the present invention, the FPGA accelerator card allocates each of the M pictures read from the DDR to a corresponding one of M JPEG decoders, and the M JPEG decoders share the same GMEM resource on the FPGA accelerator card to perform M-way parallel JPEG decoding on the M pictures, obtaining M decoded data streams; the M decoded data streams are read and merged to obtain merged decoded data; and the merged decoded data is output to the DDR. Because the multiple data channels share the same GMEM resource, multi-channel parallelism is achieved at the JPEG file level, which further improves JPEG decoding efficiency on the FPGA accelerator card.
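The overall flow summarized above (M pictures, M concurrent decoders, M decoded data streams, one merged output) can be mimicked on a host CPU with the following Python sketch. It is a loose software analogy under several assumptions and not the FPGA design: threads stand in for the M JPEG decoders, `fake_decode` is a placeholder that merely reverses bytes instead of decoding JPEG data, `None` is a hypothetical end-of-stream marker, and the shared GMEM resource is not modelled at all.

```python
import queue
import threading

def fake_decode(segment):
    # Placeholder for a JPEG decoder (assumption): just reverses the bytes.
    return segment[::-1]

def decoder_worker(picture, segment_size, out_q):
    # One worker per picture; it fills its own decoded data stream.
    for off in range(0, len(picture), segment_size):
        out_q.put(fake_decode(picture[off:off + segment_size]))
    out_q.put(None)   # hypothetical end-of-stream marker

def decode_parallel(pictures, segment_size=4):
    queues = [queue.Queue() for _ in pictures]
    workers = [threading.Thread(target=decoder_worker, args=(p, segment_size, q))
               for p, q in zip(pictures, queues)]
    for w in workers:
        w.start()
    finished = [False] * len(pictures)
    merged = []                               # (stream index, decoded segment)
    while not all(finished):
        for i, q in enumerate(queues):        # poll the streams in a fixed order
            if finished[i]:
                continue
            try:
                item = q.get_nowait()         # never block on an empty stream
            except queue.Empty:
                continue
            if item is None:
                finished[i] = True
            else:
                merged.append((i, item))
    for w in workers:
        w.join()
    return merged

merged = decode_parallel([b"abcdefgh", b"xy"])
# Segments of different streams may interleave differently from run to run,
# but the order within each stream is preserved.
print([seg for i, seg in merged if i == 0])
```

The stream index is kept with every merged segment so that the consumer of the merged data can tell which picture a segment belongs to; how the actual design tags merged data is not stated in this section.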
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A JPEG decoding method, applied to an FPGA accelerator card, characterized by comprising:
reading M pictures from a DDR of the FPGA accelerator card, wherein M is an integer greater than 1;
allocating each of the M pictures read from the DDR to a corresponding one of M JPEG decoders, wherein the M JPEG decoders share the same GMEM resource on the FPGA accelerator card and perform M-way parallel JPEG decoding on the M pictures to obtain M decoded data streams;
reading and merging the M decoded data streams to obtain merged decoded data; and
outputting the merged decoded data to the DDR.
2. The method of claim 1, wherein reading the M pictures from the DDR of the FPGA accelerator card comprises:
sequentially reading data segments of the M pictures from the DDR, based on a preset reading order and a data segment size, until all data of the M pictures has been read.
3. The method of claim 2, wherein sequentially reading the data segments of the M pictures from the DDR based on the preset reading order and the data segment size comprises:
judging a first overall state of the M pictures, wherein the first overall state indicates whether all of the M pictures have been completely read;
if the first overall state is negative, polling each of the M pictures to check whether it still has data to be read and, if the picture currently being checked has data, reading in the next data segment from that picture and updating its data state; and
updating the first overall state according to the data state of each of the M pictures, and returning to the step of judging the first overall state of the M pictures.
4. The method of claim 1, wherein reading and merging the M decoded data streams to obtain merged decoded data comprises:
sequentially reading out decoded data segments from the M decoded data streams based on a preset readout order; and
merging the decoded data segments read out from the M decoded data streams to obtain the merged decoded data.
5. The method of claim 4, wherein sequentially reading out the decoded data segments from the M decoded data streams based on the preset readout order comprises:
judging a second overall state of the M decoded data streams, wherein the second overall state indicates whether the M decoded data streams are all empty;
if the second overall state is negative, judging whether the i-th of the M decoded data streams is empty and, if it is not empty, reading out the next decoded data segment from the i-th decoded data stream, wherein i takes the values 1 to M in sequence; and
updating the second overall state according to the state of each of the M decoded data streams, and returning to the step of judging the second overall state of the M decoded data streams.
6. A JPEG decoding apparatus, applied to an FPGA accelerator card, characterized by comprising:
an input unit, configured to read M pictures from a DDR of the FPGA accelerator card, wherein M is an integer greater than 1;
a decoding unit, configured to allocate each of the M pictures read from the DDR to a corresponding one of M JPEG decoders, wherein the M JPEG decoders share the same GMEM resource on the FPGA accelerator card and perform M-way parallel JPEG decoding on the M pictures to obtain M decoded data streams; and
an output unit, configured to read and merge the M decoded data streams to obtain merged decoded data, and to output the merged decoded data to the DDR.
7. The apparatus of claim 6, wherein the input unit is specifically configured to:
sequentially read data segments of the M pictures from the DDR, based on a preset reading order and a data segment size, until all data of the M pictures has been read.
8. The apparatus of claim 6, wherein the output unit comprises:
a readout subunit, configured to sequentially read out decoded data segments from the M decoded data streams based on a preset readout order; and
a merging subunit, configured to merge the decoded data segments read out from the M decoded data streams to obtain the merged decoded data.
9. An electronic device, comprising a memory, a main processor, an FPGA (field programmable gate array) accelerator card, a DDR (double data rate) cache on the FPGA accelerator card, and a computer program stored in the memory and executable on the FPGA accelerator card, wherein the DDR cache acquires M pictures from the main processor, and M is an integer greater than 1;
when the FPGA accelerator card executes the program, the steps of the method of any one of claims 1 to 5 are performed on the M pictures.
10. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the method of any one of claims 1 to 5.
CN202011263958.1A 2020-11-12 2020-11-12 JPEG decoding method and device Pending CN112437303A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011263958.1A CN112437303A (en) 2020-11-12 2020-11-12 JPEG decoding method and device

Publications (1)

Publication Number Publication Date
CN112437303A true CN112437303A (en) 2021-03-02

Family

ID=74699979

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011263958.1A Pending CN112437303A (en) 2020-11-12 2020-11-12 JPEG decoding method and device

Country Status (1)

Country Link
CN (1) CN112437303A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination