A Novel Processing Unit Structure for an Array Video Signal Processor
1. Technical Field
The present invention relates to the structure of the processing unit in a novel array video signal processor. Applications such as video signal processing and three-dimensional image processing urgently need high-performance, programmable digital signal processors. To meet this need, the invention proposes a novel processing unit architecture based on a massively parallel array processor. The invention belongs to the field of integrated circuit (IC) design.
2. Background Art
The array digital signal processor to which the present invention relates can be widely used in high-performance video signal processing. Because it is at once high-performance, reconfigurable and programmable, it can accommodate the mainstream video encoding and decoding algorithms, such as MPEG-2, MPEG-4, H.263, DivX, H.264, AVS, VC-1, RV and MJPEG. It can also be widely used in video pre-processing and post-processing, providing powerful video and digital signal processing capability for multimedia electronic products such as high-performance video converters and servers, high-end DVD players and digital television (DTV). More importantly, the array digital signal processor not only offers low cost, low power consumption and high performance, but is also highly adaptable and can implement new video and digital signal processing algorithms that may emerge in the future. Its application prospects are therefore very broad.
The present invention targets multimedia processing applications and proposes an architecture for a dynamically reconfigurable, easily programmed array processing unit. It combines the high performance, low power consumption and low cost of an ASIC with the flexible design and development and short development cycle of a programmable DSP.
The present invention can be widely used in the consumer electronics market, including high-performance digital signal processing and field-programmable logic applications. It is particularly suitable for both low-end and high-end audio/video consumer devices. Its low power consumption, low cost and high reliability make the invention highly competitive in the market.
3. Summary of the Invention
The main content of the present invention is as follows. A novel basic processing unit structure for an array processor (AP) is proposed, as shown in Fig. 1. It comprises one arithmetic operator, two input selectors, two input registers, one fan-out controller and four configurable buffers. In the structure shown in Fig. 1, the arithmetic unit can be reconfigured and programmed by changing the configuration registers.
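The reconfiguration idea can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class names, the configuration fields (`op`, `sel_a`, `sel_b`, `fanout`) and the operation table are hypothetical and are not taken from Fig. 1; the model shows only that swapping the configuration changes what the same unit computes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical operation table; the source does not enumerate opcodes.
OPS: Dict[str, Callable[[int, int], int]] = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "and": lambda a, b: a & b,
    "max": lambda a, b: max(a, b),
}


@dataclass
class UnitConfig:
    """Stands in for the configuration state (field names are assumptions)."""
    op: str      # operation performed by the arithmetic operator
    sel_a: int   # input index chosen by input selector A
    sel_b: int   # input index chosen by input selector B
    fanout: int  # number of outputs driven by the fan-out controller


class ProcessingUnit:
    """Behavioral model: two selectors, two input registers, one operator."""

    def __init__(self, config: UnitConfig) -> None:
        self.config = config
        self.reg_a = 0  # input register A
        self.reg_b = 0  # input register B

    def reconfigure(self, config: UnitConfig) -> None:
        # Changing the configuration re-purposes the same hardware model.
        self.config = config

    def step(self, inputs: List[int]) -> List[int]:
        # The selectors latch the chosen inputs into the input registers.
        self.reg_a = inputs[self.config.sel_a]
        self.reg_b = inputs[self.config.sel_b]
        result = OPS[self.config.op](self.reg_a, self.reg_b)
        # The fan-out controller replicates the result to the enabled outputs.
        return [result] * self.config.fanout
```

For example, a unit configured as `UnitConfig("add", 0, 1, 2)` adds its first two inputs and drives two outputs; calling `reconfigure` with a different `UnitConfig` turns the same object into a subtractor without changing its structure.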
The basic processing unit can perform arithmetic operations, comparison operations, shift operations, conversion operations, logic operations and other dedicated operations. These operations can be combined into operators, and the arithmetic and logic operators can be programmed to carry out massively parallel computation. Each operator performs a few simple operations, which we have carefully selected to suit video post-processing and other digital signal processing applications.
Operators can be further grouped into compute clusters, as shown in Fig. 2. Operators within the same compute cluster can be cascaded to realize more complex functions. The clusters formed from operators are interconnected through a switching network. Multiple clusters, together with the switching network, constitute a dynamically reconfigurable AP.
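Cascading within a cluster can be sketched as a chain in which each operator's output feeds the first input of the next. The function names and the example chain below are hypothetical illustrations; the source does not specify how cascaded operators are wired.

```python
from typing import Callable, Sequence

Operator = Callable[[int, int], int]


def sub(a: int, b: int) -> int:
    return a - b


def absval(a: int, _unused: int) -> int:
    # Unary operation; the second input is ignored.
    return abs(a)


def add(a: int, b: int) -> int:
    return a + b


def run_cascade(stages: Sequence[Operator], first: int,
                side_inputs: Sequence[int]) -> int:
    """Feed each operator's output into the first input of the next operator.

    side_inputs[i] is the second input of stage i (an assumed wiring).
    """
    if len(stages) != len(side_inputs):
        raise ValueError("one side input is required per stage")
    value = first
    for op, side in zip(stages, side_inputs):
        value = op(value, side)
    return value
```

For instance, `run_cascade([sub, absval, add], 3, [10, 0, 5])` computes |3 - 10| + 5 with three simple operators, which is the kind of accumulate-absolute-difference step common in video processing.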
The basic structure of each processing unit in the AP is shown in Fig. 3, in which:
● Arithmetic logic unit (ALU): performs arithmetic and logic operations such as addition, subtraction, multiplication, division, comparison, shift, AND, OR and NOT;
● Local program memory: stores a small number of instructions;
● Registers: the preliminary design provides 8 local registers and 4 pairs of registers for port interconnection.
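The three components listed above can be modeled as a very small interpreter. This is a minimal sketch, not the unit's actual instruction set: the instruction format `(op, dst, src1, src2)`, the opcode names and the program-memory limit of 16 instructions are assumptions made for illustration. Only the counts of 8 local registers and 4 port register pairs come from the text.

```python
from typing import Callable, Dict, List, Sequence, Tuple

NUM_LOCAL_REGS = 8  # from the text
NUM_PORT_PAIRS = 4  # from the text

# Hypothetical instruction format: (opcode, destination, source 1, source 2).
Instr = Tuple[str, int, int, int]

# Hypothetical subset of ALU operations.
ALU_OPS: Dict[str, Callable[[int, int], int]] = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "shl": lambda a, b: a << b,
    "cmp_lt": lambda a, b: int(a < b),
}


class APUnit:
    """Behavioral model of one processing unit in the array processor."""

    def __init__(self, program: Sequence[Instr],
                 max_instructions: int = 16) -> None:
        # The local program memory holds only a small number of instructions;
        # the limit of 16 is an assumption.
        if len(program) > max_instructions:
            raise ValueError("program does not fit in local program memory")
        self.program: List[Instr] = list(program)
        self.regs: List[int] = [0] * NUM_LOCAL_REGS
        # Port register pairs are modeled as storage only in this sketch.
        self.ports: List[List[int]] = [[0, 0] for _ in range(NUM_PORT_PAIRS)]

    def run(self) -> List[int]:
        # Execute the stored instructions in order on the local registers.
        for op, dst, src1, src2 in self.program:
            self.regs[dst] = ALU_OPS[op](self.regs[src1], self.regs[src2])
        return self.regs
```

A two-instruction program such as `[("add", 2, 0, 1), ("shl", 3, 2, 1)]` adds registers 0 and 1 into register 2, then shifts that sum left by the value in register 1.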
Each basic compute cluster contains 9 arithmetic units. Eight such basic clusters and several memory banks are connected through a local network to form a low-level compute cluster, as shown in Fig. 4. A typical memory bank is 8 to 12 bits wide and 256 to 2048 entries deep. Low-level compute clusters can be interconnected to form high-level compute clusters. The compute cluster is thus a hierarchical structure that may have multiple levels. Likewise, in the structure shown in Fig. 4, reconfiguration and programming at the compute cluster level can be achieved by configuring the interconnection network within the cluster.
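The sizes stated above imply some simple arithmetic, sketched below. The figures 9, 8, 8 to 12 and 256 to 2048 come from the text; treating the depth as a count of words, and the grouping factor used for levels above the low-level cluster, are assumptions, since the source does not give them.

```python
from typing import List

UNITS_PER_BASIC_CLUSTER = 9       # from the text
BASIC_CLUSTERS_PER_LOW_LEVEL = 8  # from the text


def bank_capacity_bits(width_bits: int, depth_words: int) -> int:
    """Capacity of one memory bank, assuming depth counts words."""
    return width_bits * depth_words


def units_in_hierarchy(clusters_per_level: List[int]) -> int:
    """Total arithmetic units when each level groups the given number
    of clusters from the level below (grouping factors above the first
    level are hypothetical)."""
    total = UNITS_PER_BASIC_CLUSTER
    for count in clusters_per_level:
        total *= count
    return total


low_level_units = units_in_hierarchy([BASIC_CLUSTERS_PER_LOW_LEVEL])
smallest_bank = bank_capacity_bits(8, 256)
largest_bank = bank_capacity_bits(12, 2048)
```

Under these assumptions a low-level cluster holds 72 arithmetic units, and a single memory bank stores between 2048 and 24576 bits. A call such as `units_in_hierarchy([8, 4])` shows how the unit count scales if, hypothetically, four low-level clusters formed one high-level cluster.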
4. Brief Description of the Drawings
Fig. 1 shows the architecture of the basic processing unit.
Fig. 2 shows the architecture in which basic processing units form a compute cluster.
Fig. 3 shows the architecture of a processing unit in the AP.
Fig. 4 shows the architecture in which processing units form a compute cluster.
Fig. 5 shows an example AP architecture.
5. Detailed Description of the Embodiments
It is well known that an application-specific integrated circuit (ASIC) designed for a particular video processing function delivers very high performance, typically tens or even hundreds of times that of a programmable DSP. We therefore analyzed in detail the characteristics of ASIC implementations and observed the following: the types of operation used in the computation are limited, and a small number of operations are used repeatedly; the fan-out of an operation is usually small; cascading and combining operations can effectively improve resource utilization; and local interconnection between units is effective, whereas a large-scale switching network wastes considerable resources.
Based on these observations, we designed the array processor structure shown in Fig. 5. In this structure, multiple low-level clusters of the kind shown in Fig. 4 are interconnected through a network to form a high-level cluster. High-level clusters can in turn be combined through local interconnection into compute clusters at a still higher level, yielding a hierarchically structured compute cluster.
The AP we propose closely resembles a static dataflow processor: at the microscopic level it is a static stream processor, while at the macroscopic level it is a dynamic stream processor. By adopting reconfigurable hard-wired interconnect and reconfigurable arithmetic units, however, it overcomes the low efficiency of conventional stream processors.