CN101218604A - Image processing circuit with block accessible buffer memory - Google Patents

Publication number
CN101218604A
Authority
CN
China
Prior art keywords
pixel value
circuit
shift
memory
row
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2006800251323A
Other languages
Chinese (zh)
Inventor
卡洛斯·A·阿尔巴平托
拉马纳坦·塞托拉曼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips Electronics NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00: General purpose image data processing
    • G06T1/60: Memory management

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Input (AREA)
  • Image Processing (AREA)

Abstract

An image processing circuit has a buffer memory (14) for storing pixel values for pixel locations in a two-dimensional moveable window within an image. The buffer memory (14) comprises a plurality of functional rows of memory circuits (30) for storing pixel values from the window. A plurality of access ports (17) is provided, each for providing access to an addressable pixel value from a respective group of pixel values from respective ones of the rows. Shift circuits (32) are provided between the memory circuits (30) and the access ports, or as part of the arrangement of memory circuits (30). Each shift circuit is provided for a respective row and arranged to shift the assignment of pixel values from the respective row to the groups. An addressing circuit (15) has inputs for receiving an address of a two-dimensional block of pixel locations and a mode signal indicative of the dimensions of the block. The addressing circuit (15) controls the shift circuits (32) to set respective amounts of shift for respective ones of the rows dependent on the dimensions indicated by the mode signal. The addressing circuit sets the amounts of shift to values whereby pixel values for respective lines of pixel locations in the block that are stored in different ones of the rows are assigned to mutually non-overlapping groups. In this way the pixel values of the block can be accessed in parallel.

Description

Image processing circuit with block accessible buffer memory
Technical field
The invention relates to an image processing circuit and to a method of processing an image.
Background art
It is known to provide an image processing system with a buffer memory that gives fast access to stored pixel values from a rectangular window of pixel locations in a larger image. Common image processing tasks, such as computing a DCT (discrete cosine transform), two-dimensional filtering and matching of adjacent blocks of pixels, require the same operation to be repeated, each time applied to the pixel values of a different window of pixel locations. For each execution of the operation, the pixel values of the corresponding window are held in the buffer memory, so that they can be accessed quickly as part of the operation.
Before the next operation is executed, the content of the buffer memory is updated. Typically a sliding window is used, which moves by a predetermined horizontal distance from one execution of the operation to the next. In this case usually only the pixel values of a new rightmost column of the image are loaded, replacing the pixel values of the leftmost column.
To support this partial replacement, some form of cyclic addressing is preferably used in the buffer memory. This can be realized, for example, by a cyclic translation of window-relative X-Y addresses into memory addresses. For a horizontally moving window, the same memory address then corresponds to X-Y addresses with successively decreasing X values. When the X value has decreased to outside the window, a new pixel value is loaded at that memory address and the X value is increased by the window size.
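The cyclic addressing described above can be sketched in software as follows. This is only an illustration under stated assumptions: the class name `CyclicWindowBuffer`, the `origin` counter and the one-column slide are hypothetical and are not taken from the patent, which describes a hardware address translation.

```python
class CyclicWindowBuffer:
    """Window buffer whose columns are addressed cyclically (illustrative only)."""

    def __init__(self, rows):
        # rows[y][x] holds the pixel value at window-relative position (x, y).
        self.cells = [list(row) for row in rows]
        self.width = len(self.cells[0])
        self.origin = 0  # physical column that currently holds window X = 0

    def read(self, x, y):
        # Translate the window-relative X address to a physical column.
        return self.cells[y][(self.origin + x) % self.width]

    def slide_right(self, new_column):
        # The physical column of the old leftmost pixels is overwritten in place
        # and becomes the new rightmost window column; nothing else moves.
        for y, value in enumerate(new_column):
            self.cells[y][self.origin] = value
        self.origin = (self.origin + 1) % self.width


# Image pixel (x, y) has value 10*y + x; the window starts at image columns 0..3.
buf = CyclicWindowBuffer([[10 * y + x for x in range(4)] for y in range(2)])
buf.slide_right([4, 14])  # load image column 4 for both lines
print([buf.read(x, 0) for x in range(4)])  # [1, 2, 3, 4]
print([buf.read(x, 1) for x in range(4)])  # [11, 12, 13, 14]
```

After the slide, each memory location that was not overwritten is reached through an X address one lower than before, which is the "successively decreasing X" behaviour of the text.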
Of course, other schemes can be used to support partial replacement. As an alternative, old pixel values can be moved to shifted memory locations between executions of the operation for different windows, with new pixel values loaded into the vacated memory locations. In effect, a shift register is then used for each line of the image. In this case a fixed translation of X-Y addresses into memory locations can be used to address positions in the shift register.
Advantageously, a buffer memory for image processing provides parallel access to a plurality of pixel values, for example to the pixel values of a horizontal line of pixel locations in the window. In this way parallel processor circuits can be used, each processing the pixel values of one or more respective pixel locations, or groups of pixel locations, in parallel with the other processor circuits.
Typically, the size and structure of the buffer memory are chosen according to the operation that has to be executed. If the operation requires a block of pixel values for 16 × 16 pixel locations in the X and Y directions, a buffer memory with 256 (16 × 16) memory locations is typically used, and the X-Y address for the operation preferably comprises two 4-bit parts that address the window-relative X and Y positions respectively.
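As a small illustration of an X-Y address made of two 4-bit parts, the sketch below packs and unpacks such an address. Placing Y in the upper four bits is an assumption made here for the example; the text does not specify the ordering, and the function names are hypothetical.

```python
def pack_xy(x, y, bits=4):
    """Combine window-relative X and Y into one address (Y in the upper bits)."""
    limit = 1 << bits
    if not (0 <= x < limit and 0 <= y < limit):
        raise ValueError("x and y must fit in the given number of bits")
    return (y << bits) | x


def unpack_xy(address, bits=4):
    """Split an address back into its (x, y) parts."""
    return address & ((1 << bits) - 1), address >> bits


print(pack_xy(3, 2))    # 35
print(unpack_xy(35))    # (3, 2)
print(pack_xy(15, 15))  # 255, the last of the 256 memory locations
```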
However, not all processing tasks require pixel values from windows of the same size. Some operations need data from a window of 8 × 8 pixel locations; others need data from a window of 16 × 16, 9 × 9 or 17 × 17 pixel locations, and so on. For fast processing it is desirable that the buffer memory has enough locations to store all pixel values of the largest required window.
In that case, only part of the pixel values is used when an operation that needs a smaller window is executed. When the buffer memory supports parallel access to the pixel values of a line of pixel locations, only part of the parallel access capacity is used. When a plurality of parallel processor circuits is provided corresponding to the largest possible window size, only part of these processor circuits is used.
Summary of the invention
Among others, it is an object of the invention to increase the utilization of processing resources in an image processing system that provides parallel access to a plurality of memory locations of a buffer memory storing pixel values for an at least two-dimensional window of pixel locations.
The invention provides an image processing circuit according to claim 1. The invention provides for the use of blocks of different sizes when the buffer memory is accessed. The term "block" is used herein to refer to a set of pixel locations within the window of pixel locations. Typically, the pixel values for the locations of a block are accessed in parallel. Respective shift circuits are provided for respective functional rows of the buffer memory, to change, independently for each row, the access ports through which the pixel values of that row are accessed. The shift amounts of the different rows are selected dependent on a mode signal that indicates the size of the accessed block. The shift amounts used by the circuit have values such that pixel values for respective lines of pixel locations of the block, stored in different rows, can be accessed in parallel. Thus, for example, if blocks with a width of N/m pixels are used and N access ports are available, N pixels can be accessed in parallel when access to the pixel values of different rows is shifted relative to one another by integer multiples of N/m. Preferably the functional rows are implemented as geometric rows of a memory matrix on an integrated circuit, but the invention is not limited to functional rows that are also geometric rows.
Preferably, shifting is realized by implementing each functional row of the buffer memory as a circular shift register, in which pixel values are passed along the shift register so that they become accessible through different access ports. A simple chain of serially connected registers can be used. Alternatively, multiplexers can be used between the registers of the shift register to provide selectable shift steps, in order to speed up shifting. In another embodiment non-circular shift registers can be used, but in that case a larger shift register is typically needed, wider than a row of the window, to retain pixel values that are shifted "out of sight".
These and other objects and advantageous aspects of the invention will be described using non-limiting examples of embodiments.
Brief description of the drawings
Fig. 1 shows an image processing circuit;
Figs. 2a-b illustrate the relation between pixel locations and the parallel output of pixel values;
Fig. 3 shows a buffer memory;
Fig. 3a shows a shift control part;
Fig. 3b shows a row addressing part;
Fig. 4 shows a shift register;
Fig. 5 shows a buffer memory;
Fig. 6 shows a further shift control part;
Figs. 7a-b show fractional storage of image lines.
Detailed description of embodiments
Fig. 1 shows an image processing circuit comprising a main memory 10, a memory interface 12, a buffer memory 14, an addressing circuit 15, a plurality of processing circuits 16 and a control circuit 18. The buffer memory 14 is coupled to the main memory 10 via the interface 12. The buffer memory 14 has parallel data access ports 17, each coupled to a respective one of the processing circuits 16. The control circuit 18 has an instruction output coupled to the processing circuits 16, an address/control output coupled to the addressing circuit 15 and a control output coupled to the interface 12. The addressing circuit 15 has an output coupled to the buffer memory 14.
Although three access ports 17 and processing circuits 16 are shown, it should be understood that any number of access ports 17 and processing circuits 16, such as 8 or 16, may be provided in parallel.
In an embodiment, the control circuit comprises an instruction memory for storing program instructions and a program counter for addressing these instructions. The instructions include instructions that contain a command part and an address part. In this embodiment the control circuit is configured to supply the address part of an instruction to the addressing circuit 15 and the command part of that instruction to the processing circuits 16. In addition, the program counter is coupled to the interface circuit 12, to trigger transfers between the main memory 10 and the buffer memory 14 at predetermined positions in an instruction cycle. In this embodiment the processing circuits 16 are SIMD processing circuits, configured to receive the same instruction from the control circuit 18. It should be understood, however, that the invention is not limited to this embodiment. Other ways of supplying addresses to the addressing circuit 15 and of controlling the operation of the processing circuits may be used. Furthermore, although each processing circuit 16 is shown coupled to one access port 17, in another embodiment each processing circuit 16 may be coupled to a plurality of access ports, for example also to the ports neighboring its own port, or even to the ports adjacent to those neighboring ports. When the access ports are read ports, this allows each processing circuit 16 to perform more complex operations. In another embodiment, groups of processing circuits 16 may be replaced by correspondingly larger processing circuits, each with inputs coupled to a plurality of access ports 17, to perform more complex operations.
In operation, the control circuit 18 supplies instructions to the processing circuits 16 and supplies window-relative addresses and mode signals to the addressing circuit 15. Typically, the control circuit supplies at most one window-relative address at a time, to address pixel values for all processing circuits 16. The addressing circuit 15 translates each combination of a window-relative address and a mode signal into control signals for the buffer memory 14.
The buffer memory 14 may be configured to support both reading and writing from the processing circuits 16, or reading only, or writing only. When both reading and writing are possible, the control circuit also sends a read/write control signal to the addressing circuit 15 together with the window-relative address. Read operations will be discussed by way of example. In response to an address for a read operation, the buffer memory 14 retrieves pixel values for a plurality of pixel locations and outputs these pixel values in parallel at the ports 17.
The buffer memory 14 and the addressing circuit 15 support control of the relation between the pixel locations of the pixel values loaded from the main memory 10 and the combination of those pixel values that is output in parallel during a read operation.
Figs. 2a-b illustrate the relation between pixel locations and the combinations of pixel values that are output in parallel during a first and a second mode of operation respectively, for an example with four access ports 17 and processing circuits 16. In the first mode, the pixel values of one line segment (one Y address) of four successive pixel locations (four X addresses) are output in parallel. In this first mode, the window-relative address from the control circuit 18 corresponds to the window-relative XY address of, for example, the leftmost pixel location of the line segment. In the second mode, the pixel values of two successive line segments (two Y addresses) of two pixel locations each (two X addresses) are output in parallel. In this second mode, the window-relative address from the control circuit 18 corresponds to the window-relative XY address of, for example, the leftmost pixel location of the upper line segment.
The first mode can be used, for example, to process pixel values of blocks that are four pixel locations wide. In this mode each processing circuit 16 accesses, in parallel, the pixel value of a respective pixel location of a line segment in the block during execution of an instruction. The second mode can be used, for example, to process pixel values of blocks that are two pixel locations wide. In this mode the processing circuits 16 are divided into two groups of two processing circuits 16 each during execution of an instruction. When an instruction is executed for which the control circuit 18 applies a window-relative address to the addressing circuit, the first group of two processing circuits 16 accesses the pixel values of a first line segment. During execution of the same instruction, with the same window-relative address applied, the second group of processing circuits 16 accesses the pixel values of a second line segment. Thus all processing circuits 16 can be used for both block widths. Even if the number of pixel locations along the width of the block is smaller than the number of access ports 17 and processing circuits 16, no access ports 17 or processing circuits 16 need to be left unused.
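The two modes can be summarized by listing which window-relative pixel locations are delivered together for a given block width. The sketch below is a hypothetical helper, not part of the patent; the order of the returned list is chosen for readability and is not meant to say which access port carries which pixel.

```python
def block_coordinates(x, y, width, n_ports=4):
    """List the (X, Y) locations output in parallel for a block `width` wide.

    With n_ports ports, n_ports // width successive line segments of `width`
    pixel locations each are accessed at once.
    """
    lines = n_ports // width
    return [(x + k, y + j) for j in range(lines) for k in range(width)]


# First mode: one line segment of four pixel locations.
print(block_coordinates(0, 0, width=4))  # [(0, 0), (1, 0), (2, 0), (3, 0)]
# Second mode: two line segments of two pixel locations each.
print(block_coordinates(0, 0, width=2))  # [(0, 0), (1, 0), (0, 1), (1, 1)]
```

In both cases four locations are returned, so four ports and four processing circuits are kept busy.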
Fig. 3 illustrates an embodiment of the buffer memory 14 and the addressing circuit 15. By way of example a small buffer memory 14 is shown, with storage for 4 × 4 pixel values, but it should be understood that in practice a larger buffer memory may be used. In this embodiment the buffer memory 14 comprises a plurality of circular shift register circuits 30. Multiplexing circuits 32 are provided, organized in groups that each correspond to a respective access port 17. Each group of multiplexing circuits 32 is coupled between outputs of the respective shift registers 30 and its corresponding access port 17.
The addressing circuit 15 has inputs X, Y for receiving an X and a Y address, and an input M for receiving a mode signal. The addressing circuit 15 has separate control outputs coupled to the shift control inputs of the respective circular shift registers 30. In addition, the addressing circuit 15 has separate control outputs coupled to the multiplexing circuits 32 of the respective access ports 17.
The addressing circuit 15 is configured to make the shift registers 30 shift dependent on the mode signal from the mode signal input M, so that the shifts in different shift registers can differ from one another. The mode signal indicates the width W of the blocks from which pixel values are needed. The pixel values of successive shift registers 30 are shifted by a common offset C plus respective integer multiples of W, modulo N, where N is the number of pixel values stored in each shift register 30 (the width of the window stored in the buffer memory). In one embodiment:
shift of row i = (C + i*W) modulo N
The common offset C of the shifts is the same for all shift registers 30 and may be zero, for example (C=0). Thus, for the example of a 4 × 4 buffer memory (N=4) and a mode signal that indicates a block width W=2, the pixel values in successive shift registers 30 are shifted alternately by 0 and 2 positions.
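The shift relation can be checked numerically with a short sketch. The function names are hypothetical, rows are numbered from 0, and the direction of rotation (a pixel at column x moves to column x + shift) is an assumption made here; the text only fixes the shift amounts.

```python
def row_shifts(n_rows, block_width, n, offset=0):
    """Shift amount (C + i*W) mod N for each row i = 0 .. n_rows-1."""
    return [(offset + i * block_width) % n for i in range(n_rows)]


def rotate_row(row, shift):
    """Circularly shift a row: the value at column x moves to column x + shift."""
    n = len(row)
    out = [None] * n
    for x, value in enumerate(row):
        out[(x + shift) % n] = value
    return out


print(row_shifts(4, 2, 4))              # [0, 2, 0, 2]  (N=4, W=2, C=0)
print(row_shifts(4, 1, 4))              # [0, 1, 2, 3]
print(rotate_row([10, 11, 12, 13], 2))  # [12, 13, 10, 11]
```

For W=2 the output reproduces the alternating shifts of 0 and 2 positions mentioned in the text.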
Fig. 3a illustrates an embodiment of the shift control part of the addressing circuit 15, comprising a look-up table circuit 34 with an input for receiving the mode signal M and outputs for shift control signals for the respective rows of the buffer memory 14 (not shown). The look-up table (LUT) circuit 34 may be implemented as a memory that uses the mode signal M as an address and stores, at the address for each mode signal value, a set of control signals, for example according to the relation described above. Alternatively, a logic circuit that produces the same input-output relation may be used, or an arithmetic circuit that computes the control signals.
The multiplexing circuits 32 may be implemented as switches that connect the outputs of selected shift registers 30 to the access ports 17. The addressing circuit 15 is configured to generate control signals for the multiplexing circuits 32 using the window-relative XY address signals from the address signal inputs X, Y. The control signals are generated so that the outputs that carry pixel values from the selected block are connected to the access ports 17. For each access port 17, one multiplexing circuit 32 is selected. This is possible because the rows of the buffer memory are shifted relative to one another. For each relevant row of the buffer memory, W outputs of the shift register 30 of that row are output.
For example, in the embodiment wherein row i is shifted by i*W mod N, if the window-relative address is X, Y, the addressing circuit 15 selects W columns in row Y of the buffer memory 14, starting from column (X+Y*W) mod N and wrapping around to column 0 if necessary. If W is less than N, the addressing circuit 15 also selects W columns in row Y+1, starting from column (X+(Y+1)*W) mod N, again wrapping around to column 0 if necessary. Preferably, where possible, the addressing circuit selects W positions in even more rows: in row Y+j, for j=0, 1, ..., starting from column (X+(Y+j)*W) mod N and wrapping around to column 0 if necessary. In this case j takes the values j=0, ..., j_max-1, where j_max*W is less than or equal to N.
Fig. 3b illustrates a row addressing part for generating row addresses for the different columns. A look-up table (LUT) circuit 36 is provided to output the address offsets required for successive access ports 17. The mode signal M, the X address and (part of) the Y address serve as inputs to the LUT circuit 36. The LUT circuit is configured to produce offset values for the respective columns according to the mode signal and the window-relative X, Y address values. Adder circuits 38 are configured to add the Y address to the outputs of the LUT circuit 36. It will be appreciated that many implementations of the addressing circuit 15 can realize this addressing. The adder circuits 38 may be integrated with the LUT circuit 36 and/or with a decoding circuit that generates selection signals for the individual multiplexing circuits. In another embodiment, an arithmetic circuit is provided for each access port 17 to compute the row of the buffer memory that should be connected to that access port 17.
In the example embodiment given above, the row addressing part in effect computes, for each port, the integer row address Y+j that satisfies the condition
P = (X + k + (Y+j)*W) modulo N
where P is the port number, k, between 0 and W-1, counts the different pixel values along a line of the block, and j is between 0 and j_max-1.
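The port relation can be tried out on a small software model of the buffer. The sketch below is an illustration under assumptions that are not in the patent: rows, columns and ports are numbered from 0, C = 0, a pixel at column x of row i moves to column (x + i*W) mod N, and the names `shifted_buffer` and `read_block` are hypothetical.

```python
def shifted_buffer(window, w):
    """Rotate row i of the window by (i * w) mod N (common offset C = 0)."""
    n = len(window[0])
    rows = []
    for i, row in enumerate(window):
        out = [None] * n
        for x, value in enumerate(row):
            out[(x + i * w) % n] = value
        rows.append(out)
    return rows


def read_block(window, x, y, w):
    """Return the values seen at ports 0..N-1 for the block addressed by (x, y)."""
    n = len(window[0])
    rows = shifted_buffer(window, w)
    ports = [None] * n
    for j in range(n // w):          # j = 0 .. j_max-1 with j_max * w <= n
        for k in range(w):           # k counts pixels along one line of the block
            p = (x + k + (y + j) * w) % n
            assert ports[p] is None  # each port is claimed by one row only
            ports[p] = rows[y + j][p]
    return ports


# Pixel at window column c, row r has value 10*r + c.
window = [[10 * r + c for c in range(4)] for r in range(4)]
print(read_block(window, 1, 0, 2))  # [12, 1, 2, 11]
```

In this run the block covers pixels (1,0), (2,0), (1,1) and (2,1). All four ports deliver a different pixel of the block, and the top-left pixel appears at port 1, which equals (X + Y*W) mod N.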
Thus, for the example of a 4 × 4 buffer memory 14 (N=4) with blocks that are two pixel locations wide (W=2), if the window-relative XY address is X=2, Y=2, the addressing circuit 15 will control the multiplexing circuits 32 of the second row to connect the shift register 30 of that row to the zeroth and first access ports (P=0 and 1 for k=0 and 1 respectively, corresponding to j=0). In addition, the addressing circuit 15 will control the multiplexing circuits 32 of the third row to connect the shift register 30 of that row to the second and third access ports (P=2 and 3 for k=0 and 1 respectively, corresponding to j=1).
In this embodiment the addressing circuit 15 controls the shift amount in each row dependent only on the width W of the block. This complicates the address computation. Moreover, it has the effect that the access port that provides access to the pixel value for the top-left pixel location of the block depends on the X and Y address of the block: this pixel value is output to port number X+Y*W (modulo N) of the access ports 17. In a further embodiment, a shift circuit (not shown) is coupled between the access ports 17 and the multiplexing circuits 32 to rotate the pixel values cyclically over X+Y*W positions, so that a predetermined access port 17 provides access to the pixel value for a predetermined pixel location relative to the top-left corner of the block. Thus the processing circuits 16 can be adapted to that pixel value.
In an alternative embodiment, the dependence on the Y address and/or the X address is eliminated by adapting the shift amounts of the shift registers to the Y address and/or the X address. For this purpose the LUT circuit 34 may have inputs for the Y address signal and/or the X address signal, for example. In this case the addressing circuit 15 may be configured to shift the pixel values in the i-th shift register 30 by ((i-Y)*W-X) mod N, where X and Y are the window-relative X and Y address used to address the top-left pixel location of the block in the buffer memory 14. Thus, at the access ports 17, a predetermined pixel location in the addressed block always corresponds to a predetermined access port. In this case no additional shift circuit is needed at the access ports 17 to realize this. In an alternative embodiment, the addressing circuit may control the shift of the i-th shift register 30 according to (i-Y)*W mod N. In this case the row addresses can be determined by adding offsets that are independent of the Y address to the Y address. In another embodiment X mod N is added instead or as well, so that the access port 17 depends only on the Y position in the block.
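The effect of the address-dependent shift ((i-Y)*W-X) mod N can be illustrated with the following sketch. It assumes, as an illustration only, zero-based numbering and a rotation in which a pixel at column c moves to column (c + shift) mod N; the function name `read_block_aligned` is hypothetical.

```python
def read_block_aligned(window, x, y, w):
    """Read a block with row i shifted by ((i - y) * w - x) mod N.

    With this shift, pixel (x + k, y + j) of the block always arrives at
    port j * w + k, whatever the block address (x, y) is.
    """
    n = len(window[0])
    ports = [None] * n
    for j in range(n // w):
        i = y + j                          # buffer row that holds block line j
        shift = ((i - y) * w - x) % n
        row = [None] * n
        for col, value in enumerate(window[i]):
            row[(col + shift) % n] = value
        for k in range(w):
            ports[j * w + k] = row[j * w + k]
    return ports


# Pixel at window column c, row r has value 10*r + c.
window = [[10 * r + c for c in range(4)] for r in range(4)]
print(read_block_aligned(window, 1, 2, 2))  # [21, 22, 31, 32]
print(read_block_aligned(window, 1, 0, 2))  # [1, 2, 11, 12]
```

In both calls the top-left pixel of the block arrives at port 0, so no extra rotation is needed at the ports.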
In one embodiment, the addressing circuit 15 computes decoded selection signals and outputs a different selection signal to each multiplexing circuit 32, to provide on/off control of the connection to the access port. Alternatively, the addressing circuit 15 may supply address signals in common to the multiplexing circuits 32 of a column, in which case the multiplexing circuits 32 should also be configured to decode them.
Shifting can be controlled by controlling the number of shift cycles for which data shifting is enabled in each shift register 30. In one embodiment the addressing circuit enables the i-th shift register 30 to shift for i*W mod N shift cycles.
Fig. 4 shows an alternative shift register circuit for a row of the buffer memory 14. The shift register circuit 40 comprises a circular sequence of registers 42 with multiplexers 44 inserted between the registers 42. Each multiplexer 44 has inputs coupled to the output of the preceding register 42 and to the output of a register 42 a predetermined number of positions further back along the sequence. The addressing circuit 15 (not shown) controls the multiplexers 44 and the registers 42. With the multiplexers 44, a shift step of one position or of a plurality of positions in a single shift cycle can be selected in the shift register 40. Thus, by controlling the multiplexers 44 of a selected shift register 40 so that pixel values are passed along the sequence from the predetermined number of positions back, the number of shift cycles needed to realize a large shift can be reduced.
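The saving in shift cycles can be estimated with a simple model in which every cycle moves the whole register contents by either one position or by a fixed longer step. The greedy choice of step and the name `shift_with_steps` are assumptions made for this sketch, not details from the patent.

```python
def shift_with_steps(row, amount, long_step):
    """Rotate `row` by `amount` positions using steps of `long_step` or 1.

    Returns the rotated row and the number of shift cycles used. Each loop
    iteration models one shift cycle of the circular register.
    """
    n = len(row)
    remaining = amount % n
    cycles = 0
    while remaining:
        step = long_step if remaining >= long_step else 1
        row = row[-step:] + row[:-step]  # value at column x moves to x + step
        remaining -= step
        cycles += 1
    return row, cycles


row = list(range(8))
print(shift_with_steps(row, 6, long_step=1))  # ([2, 3, 4, 5, 6, 7, 0, 1], 6)
print(shift_with_steps(row, 6, long_step=4))  # ([2, 3, 4, 5, 6, 7, 0, 1], 3)
```

Both calls produce the same rotation, but the selectable step of four positions halves the number of cycles in this example.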
It should be appreciated that a further reduction of the number of shift cycles can be realized by using multiplexing circuits 44 that have a plurality of selectable inputs coupled to outputs of registers 42 at respective predetermined distances along the circular sequence of registers 42. In the most extreme form, each multiplexer 44 has selectable inputs coupled to the outputs of every register 42 from which a pixel value may need to be shifted. In this case the addressing circuit 15 controls the multiplexers 44 in each row of the buffer memory to pass pixel values along the circular sequence from the register 42 at the required distance (for example from a distance i*W, where i is the row number of the shift register 40 in the buffer memory 14). A single shift cycle then suffices to shift the data along each shift register 40. In another embodiment only a limited set of shift amounts is allowed (according to the different block sizes, for example), and selectable inputs of the multiplexers 44 are provided for all of these shift amounts, but not for other shift amounts.
In an embodiment the control circuit 18 is configured to supply a new mode control signal for each access operation. In this case, each access instruction processed by the control circuit preferably contains a field with the mode control signal value that the control circuit supplies to the addressing circuit 15. In an alternative embodiment the mode control signal may be set for a plurality of access operations, for example until a new mode control signal value is set. In this case a separate instruction may be used to set the mode signal value for subsequent access instructions. Preferably, in this alternative embodiment, the pixel values in the shift registers 30 (or 40) are shifted in response to the instruction that sets the new mode control signal value.
In both embodiments, different types of processing that use different block sizes and/or shapes can be performed during execution of the same task, without reloading the buffer memory to accommodate the different block sizes.
Fig. 5 shows an embodiment in which groups of registers 50, coupled to shift circuits 52 (for example barrel shifters) that are controlled by the addressing circuit 15, replace the shift registers 30 or 40. The outputs of the shift circuits 52 are coupled to the access ports 17 via multiplexing circuits 54 (not shown in detail) similar to the multiplexing circuits 32.
In operation, the buffer memory 14 of Fig. 5 performs row-dependent shifting by circuit switching in the shift circuits 52, that is, without transferring data from one register of a shift register chain to another. This can improve access speed, but requires more complex circuitry than when shift registers are used.
In one embodiment, for each access operation, the addressing circuit 15 makes the shift registers 30 (or 40) shift the pixel values by the shift amounts required for the access operation, and shift them back by the same amounts after the access operation.
Fig. 6 shows an embodiment of the shift control part of the addressing circuit 15 in which pixel values are shifted only before an access operation. In this embodiment a register 60 is provided to store the most recently used mode control signal, and the LUT circuit 34 (or any other circuit with the same input/output function) outputs shift control signals that correspond to the difference, if any, between the existing shift amounts (indicated by the old mode control signal value) and the new shift amounts required for the new access operation. After the shift has been controlled, the stored mode control signal value is updated. Thus the number of shift operations can be minimized, which saves power and time. When the shift amounts also depend on the X and/or Y address, the X and/or Y address is preferably also stored in the register 60 and supplied to the LUT circuit 34.
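Differential shift control of this kind can be modelled as follows. The sketch simplifies the circuit of Fig. 6: instead of storing the previous mode signal and looking up the difference in a table, it stores the shift amounts currently applied to each row and computes the difference directly. The class and method names are hypothetical, and the shift relation i*W mod N with C=0 is assumed.

```python
class DifferentialShiftControl:
    """Track applied row shifts and emit only the extra shift that is needed."""

    def __init__(self, n_rows, n):
        self.n_rows = n_rows
        self.n = n
        self.current = [0] * n_rows  # shift currently applied to each row

    def select_width(self, w):
        # Shift amounts required for block width w: (i * w) mod N for row i.
        target = [(i * w) % self.n for i in range(self.n_rows)]
        # Additional circular shift per row to get from the current state
        # to the target state.
        deltas = [(t - c) % self.n for t, c in zip(target, self.current)]
        self.current = target  # remember the new state for the next access
        return deltas


control = DifferentialShiftControl(n_rows=4, n=4)
print(control.select_width(2))  # [0, 2, 0, 2]
print(control.select_width(1))  # [0, 3, 2, 1]
print(control.select_width(1))  # [0, 0, 0, 0]  same mode again: nothing to shift
```

The last call shows the saving: when the mode does not change between accesses, no row has to be shifted at all.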
In another embodiment, addressing circuit 35 is configured to forbid the displacement in the shift register that is not addressed.If this embodiment and the embodiment that uses the difference displacement make up, the shift amount before that then preferably provides register to represent respective column, and at shift amount before this, displacement is controlled in the variation that depends on displacement.
Interface circuit 12 is supported in the transmission of pixel value between primary memory 10 and the buffering storer 10.Can use any mechanism to control transmission.In one embodiment, transmission is linked in the operation cycle of treatment circuit 16.In this embodiment, control circuit sends signal to interface circuit 12 at first, and this signal triggering interface circuit 12 is loaded into the memory buffer from the pixel value of primary memory 10 with the window of location of pixels.Next control circuit 18 starts cycle of treatment, during this operation cycle when finishing when visit, control circuit 18 transmits a signal to interface circuit 12, and this signal triggering interface circuit 12 is loaded into the memory buffer from the other row pixel value of primary memory 10 with the window of location of pixels.In this case, the displacement mechanism of memory buffer 14 can be used for realizing and can write new pixel value at predetermined register.It should be noted that this only needs same shift amount for all row, pattern does not rely on shift amount during operation.
Although in one embodiment, interface circuit 12 is configured to only from primary memory 10 pixel value is sent to memory buffer 14, in alternate embodiments, interface circuit 12 can be configured to only pixel value is sent to primary memory 10 (after handling) from buffering storer 14, perhaps before handling, pixel value is sent to memory buffer 14 from primary memory 10, and after handling, passes primary memory 10 back from buffering storer 14.
It should be noted that other forms of transfer between main memory 10 and buffer memory 14 may be used. Instead of linking transfer to the processing loop (for example, triggered by a program counter value), transfer may be performed in response to explicit transfer instructions, or as a result of addressing outside the buffer memory.
Although the invention has been described for an embodiment in which control circuit 18 addresses pixel values using window-relative XY addresses, alternative embodiments may use absolute (image-relative) XY addresses. In such an alternative embodiment, a translation circuit may be provided that converts absolute addresses into window-relative XY addresses, assuming that the XY address of the window is stored in the buffer memory. Alternatively, the internal addresses in the buffer memory (and optionally the shift amounts) may be computed directly from the absolute addresses.
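The translation from absolute to window-relative addresses amounts to subtracting the window origin. A minimal sketch, assuming a rectangular window whose top-left corner and size are known; the function name and parameters are hypothetical and only illustrate the arithmetic:

```python
def to_window_relative(abs_x, abs_y, window_x, window_y, window_w, window_h):
    """Convert an absolute (image-relative) XY address to a window-relative one.

    (window_x, window_y) is the assumed absolute address of the window's
    top-left pixel location; window_w and window_h are its size in pixels.
    """
    rel_x = abs_x - window_x
    rel_y = abs_y - window_y
    if not (0 <= rel_x < window_w and 0 <= rel_y < window_h):
        raise ValueError("address lies outside the buffered window")
    return rel_x, rel_y
```

For a 4 × 4 window whose origin is at absolute address (100, 50), the absolute address (102, 51) maps to the relative address (2, 1).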
The invention has been described using an example in which there is a one-to-one correspondence between image lines and rows of buffer memory 14. Alternatively, however, the pixel values of more than one image line may be stored in one row, or the pixel values of one image line may be distributed over a plurality of rows of buffer memory 14. In that case, the number of pixel locations of an image line that are stored in buffer memory 14 is preferably an integer multiple of the number of memory locations in a row or, conversely, the number of memory locations in a row is an integer multiple of the number of pixel locations of an image line that are stored in buffer memory 14. In another embodiment, the circuit is not limited to such integer multiples.
Figs. 7a-b show partial storage of image lines for an example in which the buffer memory has 6 access ports and rows of 6 memory locations. The figures show a matrix whose rows and columns correspond to the rows of memory locations (shift registers 30 or 40) and the columns (access ports 17) of buffer memory 14. The pixel values of line segments of pixel locations in the image are stored in the memory locations. The memory location that holds the pixel value of the leftmost pixel location of each stored line segment is indicated by a circle. With partial storage, as shown, these memory locations are not in the same column. The figures also indicate the positions of the pixel values of the line segments of a rectangular block. Fig. 7a shows the memory locations of the pixel values of a 4 × 2 block of pixel locations, marked by crosses, triangles and squares. Similarly, Fig. 7b shows the memory locations of the pixel values of a 3 × 6 block of pixel locations.
When a block of a selected size has to be accessed, a mode signal is supplied to addressing circuit 15 to indicate the selected size. In this embodiment, the addressing circuit also receives information about the length of the line segments stored in the memory, or about the offset between the memory locations at which the line segments start. In addition, the X, Y address of the block is supplied. Addressing circuit 15 controls the shift amounts and the addressing so that the pixel values of different line segments are output at different access ports 17. To this end, addressing circuit 15 controls the shift amounts and the addressing dependent on the mode signal and the length of the stored line segments.
The figures indicate, by crosses and triangles respectively, the pixel values of the respective image lines that are to be accessed in parallel at access ports 17. In the example of Fig. 7a, 4 pixel values from one line of the block are output in parallel with two pixel values of the next line. In the example of Fig. 7b, 3 pixel values from one line of the block are output in parallel with 3 pixel values of the next line.
Compared with the case in which the number of stored pixel locations equals the number of memory locations in each row of buffer memory 14, the shift amounts and the addressing for this type of access are more complex. In the case of Fig. 7a, for example, addressing circuit 15 may use shift amounts of 0 and 2. The row addresses that addressing circuit 15 generates for the first two access ports address the second row, and the addresses for the third to sixth access ports address the first row. In the case of Fig. 7b, for example, addressing circuit 15 may use shift amounts of 2, 1 and 1 for the first two rows and the third row. The row addresses that addressing circuit 15 generates for the first 3 access ports address the first row, the addresses for the next two access ports address the second row, and the address for the last access port addresses the third row.
As will be appreciated, the addressing pattern in this case is more irregular than in the example in which the image lines start at the same position in every row of buffer memory 14. Nevertheless, for any given configuration of block size, block address and length of the image line segments in buffer memory 14, it is straightforward to select a combination of shift amounts and addresses that allows the pixel values of a plurality of lines to be accessed in parallel. A person of ordinary skill in the art can readily select such a constrained combination of shift amounts, so that the required pixel locations are shifted to mutually different groups of access ports. Such a person can therefore provide a LUT circuit, or any other circuit suitable for outputting control signals, that responds to information about the block size, block address and line segment length with control signals that effect the selected combination of shift amounts and addresses. As will be appreciated, this technique can be used to support blocks of any shape, not only rectangular blocks.
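One way to see that a suitable combination of shift amounts can be found mechanically is an exhaustive search, whose results could then be tabulated in a LUT. The sketch below is an illustration only, not the patent's circuit: it assumes circular rows whose width equals the number of access ports, one shift amount per row, and a hypothetical description of the block as `(row, start_column, length)` segments.

```python
from itertools import product


def find_shifts(segments, num_ports):
    """Search for per-row circular shift amounts that avoid port conflicts.

    segments: list of (row, start_col, length) tuples, one per stored line
    segment of the block (an assumed representation).
    Returns a dict mapping row -> shift amount, or None if no choice of
    shifts places all segments on mutually non-overlapping access ports.
    """
    rows = sorted({row for row, _, _ in segments})
    for shifts in product(range(num_ports), repeat=len(rows)):
        shift_of = dict(zip(rows, shifts))
        used = set()
        conflict = False
        for row, start, length in segments:
            # Ports reached by this segment after shifting its row.
            ports = {(start + k + shift_of[row]) % num_ports for k in range(length)}
            if used & ports:
                conflict = True
                break
            used |= ports
        if not conflict:
            return shift_of
    return None
```

For a 2 × 2 block stored at columns 1 and 2 of rows 0 and 1 in a buffer with 4 ports, `find_shifts([(0, 1, 2), (1, 1, 2)], 4)` leaves row 0 unshifted and shifts row 1 by 2, so the two lines occupy ports {1, 2} and {3, 0}. Because segments are arbitrary, the same search also covers non-rectangular blocks.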
Although the invention has been described using specific embodiments, it should be understood that the invention is not limited to these embodiments. For example, buffer memory sizes larger than 4 × 4 pixel locations may be used. Nor is buffer memory 14 limited to storing square windows.
Furthermore, although the invention has been described for a buffer memory 14 that holds one pixel value in each memory location, it should be understood that more than one pixel value may be stored in each location, for example a group of four pixel values for successive pixel locations in the X direction. In that case, the group preferably defines the basic granularity of access: transfer to access ports 17 is controlled jointly for the group, and the pixel values of the group are accessed together at the same access port 17. Of course, this restricts addressing and block sizes to integer multiples of the granularity, but this is acceptable for image processing operations in most applications.
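The restriction to integer multiples of the granularity can be expressed as a simple check. A minimal sketch, assuming a granularity of four pixel values in the X direction as in the example above; the function name and return convention are hypothetical:

```python
GRANULARITY = 4  # assumed number of pixel values stored per memory location


def to_group_units(x, block_width):
    """Convert a pixel X address and block width into units of whole groups.

    Raises ValueError when the address or width is not an integer multiple
    of the granularity, since such a block cannot be addressed group-wise.
    """
    if x % GRANULARITY or block_width % GRANULARITY:
        raise ValueError("address and block width must be multiples of the granularity")
    return x // GRANULARITY, block_width // GRANULARITY
```

With this granularity, a block that starts at X = 8 and is 4 pixels wide occupies exactly one memory location (group index 2), whereas a block that starts at X = 6 is rejected.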
Furthermore, although the use of circular shift registers has been described, it should be understood that non-circular shift registers may alternatively be used (that is, registers that do not shift pixel values in at the left when those pixel values are shifted out at the right, and vice versa). In that case, the non-circular shift registers are preferably wider than a row of the window, with additional locations for receiving pixel values of the window when the pixel values are shifted to the left or right, so that no pixel values from the window are lost when data is shifted back and forth as required by the different modes. It should be appreciated, however, that smaller shift registers can be used when circular shift registers are used. In that case, a split presentation of a block may also occur.
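The difference between the two register types, and the reason a non-circular register needs extra locations, can be shown with lists. This is a behavioural sketch only; the function names and the use of `None` for empty locations are assumptions made for illustration.

```python
def shift_circular(row, amount):
    """Shift right by `amount`; values leaving one end re-enter at the other."""
    amount %= len(row)
    return row[-amount:] + row[:-amount] if amount else list(row)


def shift_noncircular(row, amount, fill=None):
    """Shift right (positive) or left (negative); values shifted out are lost."""
    n = len(row)
    if amount >= 0:
        k = min(amount, n)
        return [fill] * k + row[:n - k]
    k = min(-amount, n)
    return row[k:] + [fill] * k


# A register exactly as wide as the window loses data when shifted back and forth,
# while a register padded with two spare locations on each side does not.
window = [1, 2, 3, 4]
narrow = shift_noncircular(shift_noncircular(window, 2), -2)
padded = [None, None] + window + [None, None]
wide = shift_noncircular(shift_noncircular(padded, 2), -2)
```

Here `narrow` ends up as `[1, 2, None, None]`, whereas `wide` equals `padded` again, so the window contents survive the round trip. The circular register needs no padding, but a block may then wrap around the end of the row, which corresponds to the split presentation mentioned above.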
Addressing circuit 15 may be implemented as part of a processing circuit, or as part of a larger processing circuit that replaces the set of individual processing circuits. In this way, instructions in that processing circuit can be used to control block selection and addressing. Alternatively, addressing circuit 15 may be integrated with buffer memory 14. Furthermore, part of the function of control circuit 18 may be integrated in buffer memory 14 and/or with addressing circuit 15, for example to detect, from the addresses used and/or from information from processing circuits 16 about future addressing, when new data must be fetched from main memory 10.
Although the invention has been described mainly for embodiments in which access involves reading pixel values from buffer memory 14, it should be understood that the invention may alternatively be applied to write-only, read-write or read-only access. This only affects the direction of signal flow supported between access ports 17 and the registers of shift registers 30, 40 or registers 50. In the case of writing, multiplexing circuits 32, 54 strictly speaking function as demultiplexing circuits; if these multiplexing circuits are implemented using switched connections, there is no difference.

Claims (13)

1. An image processing circuit having a buffer memory (14) for storing pixel values for pixel locations in a two-dimensional movable window within an image, the buffer memory (14) comprising:
A plurality of functional rows of memory circuits (30) for storing pixel values from the window;
A plurality of access ports (17), each for providing access to an addressable pixel value from a respective group of pixel values from respective ones of the rows;
Shift circuits (32), each provided for a respective row and arranged to shift the assignment of pixel values from the respective row to the groups;
An addressing circuit (15) comprising inputs for receiving an address of a two-dimensional block of pixel locations and a mode signal indicative of the dimensions of the block, the addressing circuit (15) being arranged to control the shift circuits (32) to set respective shift amounts for respective ones of the rows dependent on the dimensions indicated by the mode signal, the shift amounts satisfying the condition that pixel values for respective lines of pixel locations in the block that are stored in different rows are assigned to mutually non-overlapping groups.
2. An image processing circuit according to claim 1, wherein the shift circuit (32) and the pixel value memory circuits (30) of each row form a respective shift register circuit.
3. An image processing circuit according to claim 2, wherein the addressing circuit is arranged to shift back and forth by the shift amount in the shift registers, before and after an access respectively.
4. An image processing circuit according to claim 2, wherein the addressing circuit (15) comprises a memory element (60) for representing the last used shift amount of a row, the addressing circuit being arranged to perform differential shifts, shifting the pixel values through the shift registers over a distance corresponding to the difference between the shift amounts for successive accesses.
5. An image processing circuit according to claim 2, wherein the shift register circuit (40) comprises multiplexers (44), each multiplexer having an output coupled to a respective memory circuit (42) and inputs coupled to outputs of memory circuits (42) at respective distances from said respective memory circuit along the shift register circuit (40), the addressing circuit (15) being coupled to the multiplexers (44) to control input selection dependent on the shift amount of the row.
6. An image processing circuit according to claim 2, wherein, for each row, the shift circuit (32) is coupled between the memory circuits (30) and the access ports (17), and is arranged to provide a plurality of shifted connections from the memory circuits (30) to the access ports (17).
7. An image processing circuit according to claim 2, wherein the shift registers are circular shift registers, each arranged to circularly shift the assignment of pixel values from the respective row to the groups.
8. An image processing circuit according to claim 1, wherein the addressing circuit is arranged to generate respective row addresses for respective groups, selecting, dependent on the mode signal, the row from which the addressable pixel value of the group is accessed via the access port (17) that provides access to that group, the addresses satisfying the condition that different rows are addressed for non-overlapping groups.
9. An image processing circuit according to claim 1, comprising a main memory (10) and an interface circuit (12) coupled between the main memory (10) and the buffer memory (14) and arranged to transfer pixel values of the window between the main memory (10) and the buffer memory (14).
10. An image processing circuit according to claim 1, comprising a plurality of parallel pixel value processing circuits (16), each coupled to a respective access port (17) or to a respective subset of the access ports (17).
11. An image processing circuit according to claim 10, wherein the pixel value processing circuits (16) are programmed to access pixel values from the same pixel locations of the window for blocks of different dimensions, and to send mode signals indicating the dimensions of the blocks between accesses.
12. A method of processing pixel values for pixel locations in an image, using a buffer memory (14) for storing pixel values for pixel locations in a two-dimensional window of the image, the buffer memory (14) comprising a plurality of access ports (17) for accessing pixel values in parallel, each access port providing access to an addressable pixel value from a respective group of pixel values from respective functional rows in the buffer memory (14), the method comprising:
Sending a signal to the buffer memory (14) to indicate the dimensions of a block of pixel locations whose pixel values are to be accessed in parallel;
Setting, for respective ones of the rows, respective shift amounts for the assignment of pixel values from the respective row to the groups, dependent on the indicated dimensions, the assignment satisfying the condition that, according to the shift amounts, pixel values for respective lines of pixel locations of the block that are stored in different rows are assigned to mutually non-overlapping groups;
Accessing the pixel values of the respective lines in parallel via the access ports (17).
13. A method according to claim 12, wherein the functional rows each comprise a respective shift register (40) with memory locations (42) for respective pixel values, the method comprising shifting the pixel values through the shift registers (40) to realize the shift amounts.
CNA2006800251323A 2005-05-10 2006-05-04 Image processing circuit with block accessible buffer memory Pending CN101218604A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP05103871 2005-05-10
EP05103871.9 2005-05-10

Publications (1)

Publication Number Publication Date
CN101218604A true CN101218604A (en) 2008-07-09

Family

ID=37086103

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2006800251323A Pending CN101218604A (en) 2005-05-10 2006-05-04 Image processing circuit with block accessible buffer memory

Country Status (4)

Country Link
EP (1) EP1882235A2 (en)
JP (1) JP2008541259A (en)
CN (1) CN101218604A (en)
WO (1) WO2006120620A2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI637345B (en) * 2017-10-20 2018-10-01 (中國商)上海兆芯集成電路有限公司 Graphics processing method and device
CN110610679A (en) * 2019-09-26 2019-12-24 京东方科技集团股份有限公司 Data processing method and device

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5552692B2 (en) 2009-02-20 2014-07-16 インテル・コーポレーション Multimode accessible storage device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6933970B2 (en) * 1999-12-20 2005-08-23 Texas Instruments Incorporated Digital still camera system and method
US6900811B2 (en) * 2001-01-18 2005-05-31 Lightsurf Technologies, Inc. Programmable sliding window for image processing
US6720969B2 (en) * 2001-05-18 2004-04-13 Sun Microsystems, Inc. Dirty tag bits for 3D-RAM SRAM

Also Published As

Publication number Publication date
WO2006120620A3 (en) 2007-03-08
JP2008541259A (en) 2008-11-20
EP1882235A2 (en) 2008-01-30
WO2006120620A2 (en) 2006-11-16

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20080709