CN106028049B - A two-dimensional DCT image processor - Google Patents
A two-dimensional DCT image processor
- Publication number
- CN106028049B (application number CN201610529240.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/625—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/423—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Discrete Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Complex Calculations (AREA)
Abstract
The invention belongs to the technical field of integrated circuits and in particular relates to a two-dimensional DCT image processor. The image processor comprises a data selection module, a state control module, a one-dimensional DCT module and a shift register array. The data input of the data selection module is connected to the output of the shift register array, its control signal input is connected to the output of the state control module, and its output is connected to the data input of the one-dimensional DCT module. The output of the state control module is connected to the control signal input of the shift register array and to the control signal input of the data selection module. The data input of the one-dimensional DCT module is connected to the output of the data selection module, its control signal input receives an external control signal, and its output is connected to the data input of the shift register array and also serves as the data output of the whole image processor. The beneficial effect of the invention is that the operating mode of the two-dimensional DCT image processor can be adjusted according to the precision requirements of different applications, so that it consumes less energy.
Description
Technical field
The invention belongs to the technical field of integrated circuits and in particular relates to a variable-precision, low-energy two-dimensional DCT image processor.
Background
The discrete cosine transform (DCT) is a widely used technique in digital signal processing. It has good orthogonality, separability, decorrelation and energy compaction properties, and it is the core algorithm of image compression. After a two-dimensional DCT, most of the energy of the image information is concentrated in the DC component and a small number of low-frequency components, which reduces spatial redundancy. Standards such as JPEG (Joint Photographic Experts Group), MPEG (Moving Picture Experts Group) and H.263 all use the DCT as the main scheme for compression coding.
A two-dimensional DCT circuit can be decomposed by the row-column separation method into a combination of two one-dimensional DCT circuits that perform the row computation and the column computation respectively, so the core of a two-dimensional DCT image processor is its internal one-dimensional DCT circuit. The one-dimensional DCT circuit in a conventional image processor consists of 22 multipliers and 30 adders. Because a multiplier occupies more area and consumes more power than an adder, the large number of multipliers makes the area and power consumption of a conventional two-dimensional DCT image processor comparatively high.
Different applications place different requirements on image quality, so the image processor needs to adjust image quality dynamically to meet different precision requirements. A conventional two-dimensional DCT image processor provides image processing at only one precision: a low-precision scheme fails to meet system requirements, while an unnecessarily high-precision scheme wastes energy.
Compared with conventional image processors, the improvement of the invention is to reduce the number of multipliers, thereby reducing the area and power consumption of the image processor, while the adjustable-precision design lets the two-dimensional DCT image processor adapt to the requirements of different scenarios with lower energy consumption.
Summary of the invention
The problem to be solved by the invention is the one described above for conventional two-dimensional DCT image processors. To address it, a variable-precision, low-energy two-dimensional DCT image processor is proposed.
To achieve the above object, the invention adopts the following technical solution:
A variable-precision, low-energy two-dimensional DCT image processor uses a folded architecture and has four operating modes, namely mode 4, mode 3, mode 2 and mode 1, with precision and energy consumption decreasing gradually from mode 4 to mode 1. The image processor comprises a data selection module, a state control module, a one-dimensional DCT module and a shift register array, where the one-dimensional DCT module comprises variable-precision fixed-bit-width multipliers, adders, subtractors, a threshold judgment module and pipeline registers. The data input of the data selection module is connected to the external input data and to the output of the shift register array, its control signal input is connected to the output of the state control module, and its output is connected to the data input of the one-dimensional DCT module. The control signal input of the state control module receives an external control signal, and its output is connected to the control signal input of the shift register array and to the control signal input of the data selection module. The data input of the one-dimensional DCT module is connected to the output of the data selection module, its control signal input receives an external control signal, and its output is connected to the data input of the shift register array and also serves as the data output of the whole image processor.
The state control module consists of a finite state machine whose states are the initial state, the row-computation state and the column-computation state. It switches state according to changes in the external control signal and in internal data. The control signals it outputs determine the input data source of the one-dimensional DCT module and the shift direction of the shift register array.
The data selection module determines, according to the control signal, which data are fed to the one-dimensional DCT module: in the row-computation state the control signal selects the external input data, and in the column-computation state it selects the data at the output of the shift register array.
The one-dimensional DCT module comprises variable-precision fixed-bit-width multipliers, adders, subtractors, a threshold judgment module and pipeline registers. Approximate reuse of multiplier results reduces the hardware complexity and brings the number of fixed-bit-width multipliers down to 8. The variable-precision fixed-bit-width multiplier in this design has two computation modes, a high-precision mode and a low-precision mode; the image processor uses the high-precision mode in modes 4 and 3 and the low-precision mode in modes 2 and 1. The external input data pass through a first stage of additions and subtractions and then enter the threshold judgment module, which sets the threshold according to the current operating mode. Only input data greater than the threshold are sent to the pipeline registers and fed into the fixed-bit-width multipliers on the next clock for further computation. The multiplier results then pass through a series of additions and subtractions to give the final result. Different operating modes compute different numbers of DCT coefficients: a higher-precision mode computes more coefficients and uses more adders, while a lower-precision mode computes fewer coefficients and uses fewer adders.
The shift register array consists of 64 12-bit registers, eight per row, forming an 8-by-8 matrix. Depending on the current operating mode, only the registers that are needed are enabled and shifted; the higher the precision of the operating mode, the more DCT coefficients the one-dimensional DCT module computes and the more registers need to shift. In the row-computation state, the array stores the intermediate results output by the one-dimensional DCT module row by row, with the control signal shifting the data in the array downward until all intermediate results have been stored. In the column-computation state, the array outputs the transposed intermediate results to the one-dimensional DCT module column by column, with the control signal shifting the data in the array to the left until all intermediate results have been shifted out.
The beneficial effects of the invention are that approximate reuse of multiplier results reduces the hardware complexity and power consumption of the image processor, and that the operating mode of the two-dimensional DCT image processor can be adjusted according to the precision requirements of different applications, giving it lower energy consumption.
Description of the drawings
Fig. 1 is a structural schematic diagram of the two-dimensional DCT image processor proposed in the invention;
Fig. 2 is the state transition diagram of the state control module;
Fig. 3 is a structural schematic diagram of the data selection module;
Fig. 4 is a structural schematic diagram of the one-dimensional DCT module;
Fig. 5 is a structural schematic diagram of the shift register array;
Fig. 6 is the logic circuit diagram of the conventional shift register;
Fig. 7 is the logic circuit diagram of the mode-controlled shift register.
Detailed description
The invention is described in detail below with reference to the drawings.
The invention provides a variable-precision, low-energy two-dimensional DCT image processor. It first reduces hardware complexity and power consumption through approximate reuse of multiplier results. It then adjusts precision and energy consumption by controlling the arithmetic precision and the number of operations, so that the processor can switch to a suitable operating mode for each application scenario and thereby reduce energy consumption.
As shown in Fig. 1, the two-dimensional DCT image processor consists of four modules: the data selection module, the state control module, the one-dimensional DCT module and the shift register array. It processes 64 image samples per operation. First, the mode control signal configures the image processor to the specified operating mode. The external control signal then switches the finite state machine in the state control module from the initial state to the row-computation state, and the state machine outputs the corresponding control signals. From then on the data selection module receives eight 8-bit external input samples per clock, extends them to 12 bits and feeds them to the one-dimensional DCT module; this continues for 8 clocks until all 64 image samples have entered the data selection module. In each clock cycle the one-dimensional DCT module processes eight 12-bit image samples and produces eight 12-bit intermediate results, which are fed to the shift register array for storage. Under the control signal, the shift register array shifts the intermediate results from the one-dimensional DCT module downward and stores them row by row until all 8 rows of results have been stored. At that point the state control module, driven by the change in the internal control data, switches from the row-computation state to the column-computation state and outputs the corresponding control signals. The shift register array then feeds eight transposed intermediate results per clock cycle to the data selection module; this continues for 8 clock cycles until all intermediate results have been delivered. Under the control signal, the data selection module passes the transposed intermediate results directly to the one-dimensional DCT module. In each clock cycle the one-dimensional DCT module processes eight 12-bit row-computation results and produces eight 12-bit column-computation results, so that after 8 clock cycles all 64 compressed image values are available as the final output data.
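The row pass, transposition and column pass described above can be sketched in software. The following Python is only an illustrative floating-point model, not the patented circuit: it uses the exact textbook DCT rather than the approximate multipliers, and the function names (`dct_1d`, `dct_2d_row_column`, `dct_2d_direct`) are assumptions introduced here. It shows that running a one-dimensional DCT over the rows, transposing, and running it again gives the same result as evaluating the two-dimensional definition directly.

```python
import math

N = 8  # block size used throughout the description


def alpha(u):
    """Normalization factor of the textbook DCT-II."""
    return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)


def dct_1d(x):
    """Exact one-dimensional DCT of an 8-sample sequence."""
    return [
        alpha(u) * sum(x[n] * math.cos((2 * n + 1) * u * math.pi / (2 * N))
                       for n in range(N))
        for u in range(N)
    ]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct_2d_row_column(block):
    """Row pass, transpose (the role of the shift register array), column pass."""
    rows = [dct_1d(r) for r in block]             # row-computation state
    cols = [dct_1d(c) for c in transpose(rows)]   # column-computation state
    return transpose(cols)                        # index result as C[u][v]


def dct_2d_direct(block):
    """Direct evaluation of the two-dimensional definition, for comparison."""
    return [[alpha(u) * alpha(v) * sum(
                block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]
```

The final `transpose` only restores the `C[u][v]` indexing for comparison; the hardware described in the text streams the column results out directly.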
Fig. 2 is the state transition diagram of the state control module, which has three states: the initial state, the row-computation state and the column-computation state. start is the system start signal; when start is 1 the system may begin computing, and the state machine jumps from the initial state to the row-computation state. enable_row[7] is the row shift-and-store enable signal of the registers in the eighth row of the shift register array; when it is 1, all results of the row computation have been stored in the register array, no further shift-and-store operation is needed, and the state jumps from the row-computation state to the column-computation state. enable_column[7] is the column shift-and-store enable signal of the seventh column of the shift register array; when it is 0, all intermediate results of the row computation have been shifted out of the register array and all 64 image samples have been processed, so the state jumps from the column-computation state back to the initial state and processing of the next group of image data can begin.
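The three-state machine of Fig. 2 can be written as a small transition function. This is a minimal sketch under the assumption that the only transitions are the three named in the text; the Python names `State` and `next_state` are hypothetical, and the arguments stand for the signals start, enable_row[7] and enable_column[7].

```python
from enum import Enum


class State(Enum):
    INIT = 0     # initial state
    ROW = 1      # row-computation state
    COLUMN = 2   # column-computation state


def next_state(state, start, enable_row7, enable_column7):
    """Return the next state given the current state and the three signals."""
    if state is State.INIT:
        # start = 1 means the system may begin computing
        return State.ROW if start == 1 else State.INIT
    if state is State.ROW:
        # enable_row[7] = 1 means all row results have been stored
        return State.COLUMN if enable_row7 == 1 else State.ROW
    # enable_column[7] = 0 means all intermediate results have been shifted out
    return State.INIT if enable_column7 == 0 else State.COLUMN
```

Signals that the text does not mention for a given state are simply ignored in that state; real hardware might treat them differently.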
The structure of the data selection module is shown in Fig. 3. It consists mainly of a multiplexer whose inputs are the externally supplied image data and the intermediate results of the one-dimensional DCT module. The external input data are 8-bit image samples, which must be sign-extended to 12 bits before entering the one-dimensional DCT module. The data select signal comes from the state control module: in the row-computation state the multiplexer selects the external image data, and in the column-computation state it selects the intermediate results of the one-dimensional DCT module.
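As an illustration of the sign extension and multiplexing just described, the sketch below models 8-bit and 12-bit words as unsigned bit patterns. It assumes the external samples are two's-complement values, which the text implies by calling for sign extension but does not state outright; the helper names are hypothetical.

```python
def sign_extend_8_to_12(value):
    """Extend an 8-bit two's-complement pattern (0..255) to 12 bits (0..4095)."""
    value &= 0xFF
    # Copy the sign bit (bit 7) into bits 8..11
    return (value | 0xF00) if (value & 0x80) else value


def to_signed_12(word):
    """Interpret a 12-bit pattern as a signed integer, for checking results."""
    return word - 0x1000 if (word & 0x800) else word


def select_input(row_state, external_bytes, array_words):
    """Multiplexer: external data in the row state, array output otherwise."""
    if row_state:
        return [sign_extend_8_to_12(b) for b in external_bytes]
    return list(array_words)
```

For example, the 8-bit pattern `0x80` (-128) becomes `0xF80`, which is still -128 when read as a 12-bit value.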
The two-dimensional DCT is defined as follows:

$$C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\,\cos\frac{(2x+1)u\pi}{2N}\,\cos\frac{(2y+1)v\pi}{2N}$$

where f(x, y) is the image data before compression, C(u, v) is the DCT coefficient after compression, and α(u) is a function of u given by:

$$\alpha(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u = 1, \dots, N-1 \end{cases}$$

The two-dimensional DCT is separable and can be decomposed into two successive one-dimensional DCTs. The one-dimensional DCT is defined as follows:

$$C(u) = \alpha(u) \sum_{x=0}^{N-1} f(x)\,\cos\frac{(2x+1)u\pi}{2N}$$

For N = 8, the conventional one-dimensional DCT can be expressed in matrix form (apart from the common scale factor of 1/2):

$$\begin{bmatrix} W_0 \\ W_2 \\ W_4 \\ W_6 \end{bmatrix} = \begin{bmatrix} d & d & d & d \\ b & f & -f & -b \\ d & -d & -d & d \\ f & -b & b & -f \end{bmatrix} \begin{bmatrix} x_0 + x_7 \\ x_1 + x_6 \\ x_2 + x_5 \\ x_3 + x_4 \end{bmatrix}$$

$$\begin{bmatrix} W_1 \\ W_3 \\ W_5 \\ W_7 \end{bmatrix} = \begin{bmatrix} a & c & e & g \\ c & -g & -a & -e \\ e & -a & g & c \\ g & -e & c & -a \end{bmatrix} \begin{bmatrix} x_0 - x_7 \\ x_1 - x_6 \\ x_2 - x_5 \\ x_3 - x_4 \end{bmatrix}$$

where x0-x7 are the input data, W0-W7 are the computed DCT coefficients, and a = cos(π/16), b = cos(2π/16), c = cos(3π/16), d = cos(4π/16), e = cos(5π/16), f = cos(6π/16), g = cos(7π/16). Computing with these matrices requires 22 full-precision multipliers and 30 adders. Because a multiplier occupies more area and consumes more power than an adder, the conventional one-dimensional DCT circuit performs poorly in both power and area, which in turn degrades the conventional two-dimensional DCT image processor. The invention introduces an approximate computation method that reduces the number of multipliers to 8, as follows. The matrix products show that both the even and the odd coefficients are obtained by multiplying the column matrix on the right by different constants. The product of the column matrix with one of those constants can therefore be reused, through shifts, to represent the products with the remaining constants, which saves multipliers. Through simulation and error analysis it was decided that the even coefficients are expressed in terms of the products of the constant d with the column matrix:
b ≈ d + d/4
f ≈ d/2
W0 = d1 + d2 + d3 + d4
W2 = d1 + d1/4 + d2/2 - d3/2 - d4 - d4/4
W4 = d1 - d2 - d3 + d4
W6 = d1/2 - d2 - d2/4 + d3 + d3/4 - d4/2
where d1, d2, d3 and d4 are the products of the constant d with the first, second, third and fourth rows of the right-hand column matrix, that is, d1 = d(x0 + x7), d2 = d(x1 + x6), d3 = d(x2 + x5) and d4 = d(x3 + x4). The odd coefficients are expressed in terms of the products of the constant c with the column matrix:
a ≈ c + c/4
e ≈ c/2 + c/8
g ≈ c/4
W1 = c1 + c1/4 + c2 + c3/2 + c3/8 + c4/4
W3 = c1 - c2/4 - c3 - c3/4 - c4/2 - c4/8
W5 = c1/2 + c1/8 - c2 - c2/4 + c3/4 + c4
W7 = c1/4 - c2/2 - c2/8 + c3 - c4 - c4/4
where c1, c2, c3 and c4 are the products of the constant c with the first, second, third and fourth rows of the right-hand column matrix, that is, c1 = c(x0 - x7), c2 = c(x1 - x6), c3 = c(x2 - x5) and c4 = c(x3 - x4). The expressions for W0-W7 show that the one-dimensional DCT circuit of the invention needs only 8 multipliers, to compute d1-d4 and c1-c4. The expressions also contain many identical subexpressions, which can be shared in the hardware design to further reduce the number of adders and hence the area and power consumption of the circuit.
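The expressions for W0-W7 translate directly into code. The sketch below is a floating-point illustration only: the patent describes fixed-bit-width hardware in which each division by 2, 4 or 8 would be a shift, and the function names are assumptions introduced here. It uses exactly eight multiplications (four by d, four by c), and a reference function evaluates the unscaled exact coefficients for comparison.

```python
import math

C_CONST = math.cos(3 * math.pi / 16)  # constant c
D_CONST = math.cos(4 * math.pi / 16)  # constant d


def approx_dct_1d(x):
    """Approximate 8-point DCT using only products with c and d."""
    s = [x[0] + x[7], x[1] + x[6], x[2] + x[5], x[3] + x[4]]  # first-stage sums
    t = [x[0] - x[7], x[1] - x[6], x[2] - x[5], x[3] - x[4]]  # first-stage differences
    d1, d2, d3, d4 = (D_CONST * v for v in s)  # 4 multiplications
    c1, c2, c3, c4 = (C_CONST * v for v in t)  # 4 multiplications
    w0 = d1 + d2 + d3 + d4
    w2 = d1 + d1 / 4 + d2 / 2 - d3 / 2 - d4 - d4 / 4
    w4 = d1 - d2 - d3 + d4
    w6 = d1 / 2 - d2 - d2 / 4 + d3 + d3 / 4 - d4 / 2
    w1 = c1 + c1 / 4 + c2 + c3 / 2 + c3 / 8 + c4 / 4
    w3 = c1 - c2 / 4 - c3 - c3 / 4 - c4 / 2 - c4 / 8
    w5 = c1 / 2 + c1 / 8 - c2 - c2 / 4 + c3 / 4 + c4
    w7 = c1 / 4 - c2 / 2 - c2 / 8 + c3 - c4 - c4 / 4
    return [w0, w1, w2, w3, w4, w5, w6, w7]


def exact_dct_1d_unscaled(x):
    """Exact coefficients with the same scaling as the W0-W7 expressions."""
    out = [D_CONST * sum(x)]
    for k in range(1, 8):
        out.append(sum(x[n] * math.cos((2 * n + 1) * k * math.pi / 16)
                       for n in range(8)))
    return out
```

W0 and W4 involve only the constant d, so they match the exact values; the other coefficients carry the error of the approximations b ≈ d + d/4, f ≈ d/2, a ≈ c + c/4, e ≈ c/2 + c/8 and g ≈ c/4.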
As shown in Fig. 4, the one-dimensional DCT module comprises variable-precision fixed-bit-width multipliers, adders, subtractors, a threshold judgment module and pipeline registers. The inputs first pass through a stage of additions and subtractions, namely the additions and subtractions in the column matrices above, and the results are fed to the threshold judgment module. The mode control signal determines the current operating mode. The threshold judgment module sets the threshold according to that mode and screens out input data that do not exceed it, which reduces the number of operations performed by the following fixed-bit-width multipliers and so lowers system energy consumption. Higher-precision modes use relatively small thresholds, reducing the operation count slightly while preserving precision; lower-precision modes use relatively large thresholds, reducing the operation count substantially within an acceptable loss of precision. Through repeated simulation and error analysis, the threshold of mode 4 was set to 2: an input enters the fixed-bit-width multiplier only when it is greater than 2; otherwise no multiplication is performed and the multiplication result is set directly to 0. Likewise, the threshold of mode 3 is 3, the threshold of mode 2 is 4, and the threshold of mode 1 is 10. When an input does not pass the threshold module, a register stage holds the previous operand of the fixed-bit-width multiplier in order to reduce signal toggling in the circuit; this is the pipeline register in Fig. 4. Introducing the pipeline register also shortens the critical path of the image processor and raises its operating frequency.
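The threshold gating can be modelled as follows. This is a hedged sketch: the thresholds per mode come from the text, but comparing the magnitude (`abs`) of a signed input against the threshold is an assumption made here, since the text only says "greater than". The names `THRESHOLDS`, `threshold_gate` and `gated_multiply` are hypothetical.

```python
# Threshold per operating mode, as stated in the description
THRESHOLDS = {4: 2, 3: 3, 2: 4, 1: 10}


def threshold_gate(value, mode, held_operand):
    """Return (operand seen by the multiplier, whether the multiply runs)."""
    # Assumption: the magnitude of the signed input is compared
    if abs(value) > THRESHOLDS[mode]:
        return value, True
    # Input rejected: keep the previous operand so the multiplier inputs do not toggle
    return held_operand, False


def gated_multiply(value, mode, coeff, held_operand=0):
    """Return (product, new contents of the pipeline register)."""
    operand, enabled = threshold_gate(value, mode, held_operand)
    product = coeff * operand if enabled else 0  # skipped products are forced to 0
    return product, operand
```

In mode 4, for instance, an input of 2 is screened out and yields a product of 0, while an input of 3 is multiplied normally.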
The variable-precision fixed-bit-width multiplier in the figure has two computation modes, a high-precision mode and a low-precision mode, and which one is used is determined by the operating mode of the image processor. The higher-precision modes 4 and 3 use the high-precision computation mode, while the lower-precision modes 2 and 1 use the low-precision computation mode, lowering arithmetic precision in order to reduce system energy consumption. The last stage is a purely combinational logic circuit of adders and subtractors: the results of the fixed-bit-width multipliers pass through a series of additions and subtractions to give the final DCT coefficients. Because the expressions for W0-W7 contain many identical subexpressions, this part of the circuit uses common subexpression elimination to reduce the amount of computation at the algorithm level and lower circuit complexity. To reduce system energy consumption further, the number of DCT coefficients computed decreases as the operating precision decreases, and the number of adders and subtractors that need to operate decreases accordingly. The importance of the DCT coefficients declines gradually from W0 to W7, so the first coefficient to drop is W7, followed by W6, and so on. Operating mode 4 has the highest precision and computes all the DCT coefficients; mode 3 computes W0-W6; mode 2 computes W0-W5; and the lowest-precision mode gives up three DCT coefficients, computing only W0-W4.
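The per-mode settings given above can be collected into a small table. The sketch below is illustrative only; the names `ModeConfig`, `MODES` and `keep_coefficients` are hypothetical, and treating a dropped coefficient as zero is an assumption, since the text says only that those coefficients are not computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeConfig:
    high_precision_multiplier: bool  # multiplier computation mode
    coefficients: int                # number of coefficients computed, from W0 upward


MODES = {
    4: ModeConfig(True, 8),   # W0-W7
    3: ModeConfig(True, 7),   # W0-W6
    2: ModeConfig(False, 6),  # W0-W5
    1: ModeConfig(False, 5),  # W0-W4
}


def keep_coefficients(w, mode):
    """Keep the coefficients the mode computes; assume the rest are zero."""
    n = MODES[mode].coefficients
    return list(w[:n]) + [0] * (len(w) - n)
```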
Fig. 5 shows the structure of the shift register array. As the figure shows, the array consists of 64 12-bit registers, eight per row, forming an 8-by-8 matrix. In the row-computation state, the shift register array stores the intermediate results output by the one-dimensional DCT module row by row, with the control signal shifting the data in the array downward until all intermediate results have been stored. In the column-computation state, the array outputs the transposed intermediate results to the one-dimensional DCT module column by column, with the control signal shifting the data in the array to the left until all intermediate results have been shifted out. The register array contains two types of shift register: the mode-controlled shift register, which is controlled by the mode signal, and the conventional shift register, which is not. Fig. 6 is the logic circuit diagram of the conventional shift register. This register can receive data either from the register above it or from the register on its left, chosen by a multiplexer whose select signal is provided by the state control module. It has two enable signals, which control shift-and-store in the row-computation state and in the column-computation state respectively; both are provided by the state control module. Fig. 7 is the logic circuit diagram of the mode-controlled shift register. Its overall circuit structure is similar to that of the conventional shift register; the only difference is that the mode signal additionally switches the register on and off. The three leftmost columns of the array are mode-controlled shift registers, which store the three DCT coefficients W7, W6 and W5 respectively; the remaining 5 columns are all conventional shift registers. When the image processor of the invention operates in mode 3, the coefficient W7 is not computed and the mode signal can turn off the mode-controlled registers in the leftmost column to save energy. In mode 2 the two leftmost columns are turned off, and in mode 1 the three leftmost columns are turned off.
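The way shifting down and then shifting left performs a transposition can be seen in a small behavioural model. This sketch is an assumption-laden illustration, not the circuit of Figs. 5 to 7: it ignores the enable signals and mode gating, assumes new rows enter at the top, and reads data out of the left edge; the class name `ShiftRegisterArray` is hypothetical.

```python
class ShiftRegisterArray:
    """Behavioural model of an 8-by-8 array that loads rows and unloads columns."""

    def __init__(self, size=8):
        self.size = size
        self.cells = [[0] * size for _ in range(size)]

    def shift_down(self, new_row):
        """Row state: every row moves down one place, the new row enters at the top."""
        self.cells = [list(new_row)] + self.cells[:-1]

    def shift_left(self):
        """Column state: the leftmost column leaves, the others move left."""
        out = [row[0] for row in self.cells]
        self.cells = [row[1:] + [0] for row in self.cells]
        return out
```

After eight `shift_down` calls, each `shift_left` call returns one element from every stored row, that is, one column of the intermediate matrix. In this model the first row loaded ends up at the bottom, so each column comes out in reverse row order; how the real array orders its outputs is not specified in the text.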
Compared with conventional image processors, the variable-precision, low-energy two-dimensional DCT image processor proposed by the invention can adjust its operating mode according to the precision requirements of different applications, achieving lower energy consumption while still meeting system requirements. When the image processor switches from operating mode 4 to mode 3, the PSNR of the image drops by 2.8 dB and energy consumption falls by 17.4%. When it switches from mode 3 to mode 2, the PSNR drops by 2.6 dB and energy consumption falls by 21.4%. When it switches from mode 2 to mode 1, the PSNR drops by 2.9 dB and energy consumption falls by 26.3%.
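To get a feel for the overall trade-off between mode 4 and mode 1, the three steps can be combined. The sketch below assumes that each energy percentage is measured relative to the mode being switched from, which the text does not state explicitly; under that assumption the savings compound multiplicatively while the PSNR losses in dB add. The names `STEPS` and `cumulative` are hypothetical.

```python
# (transition, PSNR drop in dB, fractional energy reduction), from the description
STEPS = [("4->3", 2.8, 0.174), ("3->2", 2.6, 0.214), ("2->1", 2.9, 0.263)]


def cumulative(steps):
    """Return (total PSNR drop in dB, total fractional energy reduction)."""
    psnr_drop = sum(step[1] for step in steps)
    remaining = 1.0
    for _, _, saving in steps:
        # Assumption: each saving is relative to the previous mode's energy
        remaining *= 1.0 - saving
    return psnr_drop, 1.0 - remaining
```

Under this reading, going from mode 4 to mode 1 costs about 8.3 dB of PSNR and saves roughly 52% of the energy, since 0.826 × 0.786 × 0.737 ≈ 0.478.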
Claims (1)
1. A two-dimensional DCT image processor, comprising a data selection module, a state control module, a one-dimensional DCT module and a shift register array; the data input of the data selection module is connected to external input data and to the output of the shift register array, and the control signal input of the data selection module is connected to the output of the state control module; the control signal input of the state control module receives an external control signal, and its output is connected to the control signal input of the shift register array and to the control signal input of the data selection module; the data input of the one-dimensional DCT module is connected to the output of the data selection module, the control signal input of the one-dimensional DCT module receives an external control signal, and the output of the one-dimensional DCT module is connected to the data input of the shift register array and also serves as the data output of the whole image processor;
the state control module consists of a finite state machine whose states are an initial state, a row-computation state and a column-computation state; the finite state machine switches state according to changes in the external control signal and in internal data; the control signals output by the finite state machine determine the input data source of the one-dimensional DCT module and the shift direction of the shift register array;
the data selection module determines, according to the control signal, the data input to the one-dimensional DCT module: in the row-computation state the control signal selects the external input data to enter the one-dimensional DCT module, and in the column-computation state the control signal selects the data at the output of the shift register array to enter the one-dimensional DCT module;
the one-dimensional DCT module comprises, connected in sequence, an adder-subtractor, a threshold judgment module, a pipeline register, a variable-precision fixed-bit-width multiplier and an adder-subtractor; the variable-precision fixed-bit-width multiplier has two computation modes, a high-precision computation mode and a low-precision computation mode; the external input data pass through a first stage of additions and subtractions and then enter the threshold judgment module, which sets the threshold according to the current operating mode; only input data greater than the threshold are sent to the pipeline register and fed into the fixed-bit-width multiplier on the next clock for further computation; the result of the fixed-bit-width multiplier then passes through additions and subtractions to give the final result;
the shift register array consists of 64 12-bit registers, eight per row, forming an 8-by-8 matrix; according to the current operating mode, the registers that are needed are enabled and shifted; the higher the precision of the operating mode, the more DCT coefficients the one-dimensional DCT module computes and the more registers need to shift; there are four operating modes, namely mode 4, mode 3, mode 2 and mode 1, with precision and energy consumption decreasing gradually from mode 4 to mode 1; the operating mode determines the computation mode of the fixed-bit-width multiplier, the higher-precision modes 4 and 3 using the high-precision computation mode and the lower-precision modes 2 and 1 using the low-precision computation mode; in the row-computation state, the shift register array stores the intermediate results output by the one-dimensional DCT module row by row, the control signal shifting the data in the shift register array downward until all intermediate results have been stored; in the column-computation state, the shift register array outputs the transposed intermediate results to the one-dimensional DCT module column by column, the control signal shifting the data in the shift register array to the left until all intermediate results have been shifted out of the shift register array.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610529240.XA CN106028049B (en) | 2016-07-06 | 2016-07-06 | A kind of two-dimensional dct image processor |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106028049A CN106028049A (en) | 2016-10-12 |
CN106028049B true CN106028049B (en) | 2018-11-13 |
Family
ID=57107725
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610529240.XA Expired - Fee Related CN106028049B (en) | 2016-07-06 | 2016-07-06 | A kind of two-dimensional dct image processor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106028049B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107018420B (en) * | 2017-05-08 | 2019-07-12 | 电子科技大学 | A kind of low-power consumption two-dimension discrete cosine transform method and its circuit |
CN108040257A (en) * | 2017-11-20 | 2018-05-15 | 深圳市维海德技术股份有限公司 | A kind of two-dimensional dct Hardware Implementation and device |
CN109451307B (en) * | 2018-11-26 | 2021-01-08 | 电子科技大学 | One-dimensional DCT operation method and DCT transformation device based on approximate coefficient |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102857756A (en) * | 2012-07-19 | 2013-01-02 | 西安电子科技大学 | Transfer coder adaptive to high efficiency video coding (HEVC) standard |
CN103369326A (en) * | 2013-07-05 | 2013-10-23 | 西安电子科技大学 | Transition coder applicable to HEVC ( high efficiency video coding) standards |
Also Published As
Publication number | Publication date |
---|---|
CN106028049A (en) | 2016-10-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20181113; Termination date: 20210706 |