GB2216301A - Bit-block transfer

Info

Publication number: GB2216301A
Application number: GB8903168A
Authority: GB
Inventors: Sakarin Suwannukul
Original assignee: Hercules Computer Technology
Current assignee: Hercules Computer Technology
Priority date: 1988-02-19
Filing date: 1989-02-13
Publication date: 1989-10-04
Also published as: GB8903168D0; KR890013562A; DE3837847A1

Description

METHOD AND APPARATUS FOR BIT BLOCK TRANSFER

The present invention is directed generally to data relocation in a random access memory, and more particularly to the high-speed relocation of blocks of data in a bit mapped graphics system.

In bit mapped graphics display systems, each dot or pixel on the display screen is represented by associated data stored in what is typically called a frame buffer.

A typical operation in such a system is the moving of contiguous blocks of dots, or pixels, from one point on the screen to another. It is also typical that the internal architecture of the system performs such an operation by addressing the frame buffer on a word-by-word basis. That is, the frame buffer is arranged so that each frame buffer address represents not just one pixel, but a collection of pixels. In a system where a word of data comprises eight bits, i.e., a byte of data, data representing eight pixels will be read from the frame buffer or written into the frame buffer each time the frame buffer is addressed.

As is often the case, the block of dots or pixels sought to be moved has its boundary somewhere within a word or byte of data, i.e., at a position which is offset from the word or byte address or boundary in the frame buffer.

In order to use the word-wise addressing scheme of the frame buffer architecture, but still accommodate the occurrence of a block boundary at a position offset from a word or byte address or boundary, the traditional approach is to read an entire byte of data, shift or rotate the data within the CPU according to a bit index, and then write the rotated data into the desired destination location in the frame buffer.

The speed at which such operations can occur is often attributed to the bandwidth of the frame buffer.

Hence the lack of bandwidth in a frame buffer is most often blamed for the lack of performance of the frame buffer. However, an analysis of software which is typically utilized in controlling the frame buffer during a data relocation operation reveals that ten lines or more of code are executed by the central processing unit ("CPU") during each access of the frame buffer. The execution of this code is an overhead that remains even though the frame buffer itself can be accessed in an insignificant amount of time.

On closer observation of the above code, it can be seen that it deals mainly with keeping track of bit locations when the source and destination locations of the block of information, relative to the word boundaries in the frame buffer, are different.

For example, assume that the block of information to be relocated begins at other than a word boundary, and also that the block boundary at the destination location starts at a different bit position within a word. In such a situation the first word from the source location can contain more bits of data, or fewer bits of data, than are to be accommodated by the first word at the destination location. For example, if the first word read from the source has five pixels to be transferred, but the first destination word is intended to hold only two of them, three pixels from the source word need to be carried over for use in the next read operation. Assuming that there are more pixels to be transferred, the next read acquires eight more pixels from the frame buffer for transfer. However, five of these new pixels and the three pixels that were carried over from the previous read need to be combined for storage in the next word location at the destination location. More particularly, the three pixels carried over from the previous read need to have their bit positions rotated backwards so that they occupy the first three bit positions in the new destination word.

Then the five new pixels need to have their bit positions rotated forward so that they occupy the remaining five bit positions in the new destination word. Further, three of the new pixels need to be carried over to the next read operation. In all, in the second read operation, eleven pixels were carried and rotated by the software so that the eight oldest pixels could be transferred and the three newest pixels could be carried over to the next read operation.

Each of these steps involves the fetching of code and the execution of such code by the CPU, thereby consuming valuable time. Making matters worse, the above operations are often embedded within an inner loop, which means that the inefficiency of such operations is multiplied by the number of times the inner loop is executed.
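The carry bookkeeping described above can be sketched in software as follows. This is only an illustrative model and not code from any actual graphics driver: the function name `software_realign`, the use of Python lists to stand for words (index 0 being the first bit position), and the pixel labels are all assumptions made for the example.

```python
WORD = 8  # pixels per frame-buffer word, as in the eight-bit example


def software_realign(source_words, shift):
    """Delay a stream of words by `shift` pixel positions (0 <= shift < WORD).

    Hypothetical model of the traditional CPU loop: every word read is
    combined with the pixels carried from the previous read, the oldest
    WORD pixels are emitted, and the newest `shift` pixels are carried on.
    """
    carry = ["x"] * shift                # don't-care pixels before the first read
    out = []
    for word in source_words:
        combined = carry + list(word)    # carried pixels come first
        out.append(combined[:WORD])      # the eight oldest go to the destination
        carry = combined[WORD:]          # the newest wait for the next read
    return out, carry


# The first source word holds five block pixels (positions 3-7) and the
# first destination word has room for two (positions 6-7), so shift = 3.
words = [list("bbbIJKLM"), list("NOPQRSTU")]
out, carry = software_realign(words, 3)
```

In the second iteration the list `combined` holds eleven pixels, three carried and eight new, which is the per-word work the text attributes to the CPU.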

It is therefore highly desirable to minimize the amount of software code required to be executed in order to perform a data relocation operation. It is also desirable to minimize the need for shifting or rotating data prior to writing it into the new destination location.

The present invention significantly alleviates the above problems by allowing the host to access the frame buffer with what is effectively a floating word boundary. The result is a much higher performance, not only in terms of increased speed, but also in terms of reduced code size. Code size is an important influence on speed in systems where executable code is swapped in and out of slower secondary storage.

In accordance with the present invention, the software code is relieved of any requirement to carry pixels. Every transfer to or from the memory adds a word of pixels from the source location in memory to a pipeline, and pushes out a word of pixels for writing into the destination location in memory.

More particularly, the present invention is a method and apparatus for relocating a block of bit-wise data from a source location in a memory to a destination location in the memory, wherein data in the memory is organized and addressed in the form of words of a predetermined number of bits and the source and destination locations can occur at different bit positions within different words in the memory. The apparatus comprises destination word means receiving and storing bit-wise data for forming the bit-wise data into a destination word to be written to the destination location; holding means receiving and storing bit-wise data for subsequent use in the destination word means; and word splitting means communicating with the memory, the destination word means, and the holding means, for allocating the bit-wise data received from the memory to the destination word means and to the holding means, and for allocating bit-wise data received from the holding means to the destination word means so that the bit-wise data is positioned at the proper bit positions within the destination word without the need for shifting and rotating such bit-wise data.

In accordance with the method of the present invention, the block boundary of the block of data to be transferred from the source location is compared with its destination position, all relative to word boundaries in the memory. This comparison yields a quantity which is utilized to determine which bits from a word of data coming from the source location are to be combined with bits held over from the immediately preceding read operation, and which bits are to be held over for use in the next read operation. Further, the quantity indicates the order in which the bits from the word are to be used. This allocation degree quantity remains constant while the data in the block is split and combined and then written to the destination location.

When a block boundary is encountered, a protection mask is utilized to preserve the integrity of background data at the destination location which data is adjacent to, but falls outside of, the destination location.

In the above manner, a fast block data transfer can be realized with a minimum of CPU participation and without shifting or rotating of the data being relocated. Further, the present invention accommodates the relocation of bits from left to right as well as right to left.

While it is believed that in the disk-drive art a form of splitting of blocks of data is performed at the sector level, it is also believed that heretofore no attempt has been made to split up words of data at the bit position level in a manner which provides in effect a floating word boundary.

It is therefore an object of the present invention to provide high performance control of data movement in frame buffers.

It is another object of the present invention to provide a method and apparatus for moving a block of data in memory which employs the splitting of data from each word in the block of data being moved between holding means and destination word formation means.

It is a further object of the present invention to provide a method and apparatus for moving a block of data in memory without shifting or rotating of the data.

It is a still further object of the present invention to provide in a graphics display system a method and apparatus for moving a block of data in memory which requires a minimum of participation of the central processing unit of the graphics display system.

It is still another object of the present invention to provide a method and apparatus for moving a block of data in memory in which an allocation degree quantity is determined from the starting and ending address of the source location and destination location block boundaries, and from the word size being used in the system, where such quantity determines the manner in which data retrieved from memory is allocated between destination word formation means, for forming of the word to be written into the destination location, and holding means for use in forming subsequent words for writing to the destination location.

These and other objectives, features and advantages of the present invention will be more readily appreciated upon consideration of the following detailed description of the invention and accompanying drawings.

Figure 1 is a chart illustrating the relocation of a block of data from a source location in memory to a destination location in memory.

Figure 2 is a conceptual block diagram of the present invention.

Figure 3 is a flow diagram illustrating the operational flow of data in accordance with the present invention.

Figure 4 is an illustrative example of the present invention utilizing the information of Figure 1.

Figures 5A and 5B are illustrative examples of the present invention in which a block of bit-wise data is moved in a right to left direction.

Referring to Figure 1, there is illustrated a transfer of a block 10 of data from a source location, starting at bit position 2 in a word, to a destination location which begins at bit position 4 within a different word in memory. Further, the transfer of a different block 12 of data is also illustrated.

In the figure, B's represent background data while I's and J's represent image data, i.e., the data in the blocks of data to be transferred. Further, S's and U's represent existing data at the destination location. No attempt will be made to provide a detailed description of the operations involved in reading, writing and addressing of the memory, such operations being conventional and well known.

From Figure 1 it can be seen that when the first word is obtained from the source location, the first six bits of the block of bits to be transferred will have been retrieved from memory. It can also be seen that at the destination location, only four of the bit positions in the first word will accommodate bits from the block of bits being relocated. This is because the block boundary at the destination location relative to the word boundary is shifted over by two when compared with the block boundary at the source location.

In the typical graphics system, the six bits from the source location would be rotated by the CPU so that the first four bits are shifted into the proper bit positions relative to the word boundaries at the destination location. The next word is then retrieved from the source location, combined with the bits unused from the previous read operation, rotated, then written into the destination location. Such operations continue until all of the data within the block has been written into the destination location. It should therefore be clear that such operations require a significant amount of CPU execution time and lines of code to accomplish.

Referring to Figure 2, a functional block diagram of the present invention will now be described. Memory 15 stores the frame of information. It is to be understood that, while the following description will be in terms of a single plane of memory, the present invention applies equally well to multiple planes of memory, such as in a color graphics system.

Data read from memory 15 is provided on bus 14, also called the A bus, to the apparatus of the present invention. Data to be written into memory 15 is received on bus 16. Read/write enable control signals are received from a CPU 18 via lines 20. Memory 15 is addressed by CPU 18 via address bus 22. CPU 18 also supplies a direction flag on line 24, mask protect signals on bus 26, and allocation degree information on the lines indicated by dashed lines 28. The user supplies source and destination location coordinate information to the CPU 18, as well as block size information, on lines 29.

The present invention includes a destination register 30 and a source register 32. The destination register 30 is used to form words which are to be written into the destination location, while the source register 32 operates as a holding register to hold data from the source location for use by the destination register 30.

Destination register 30 receives its data from multiplexer 34. Multiplexer 34 operates to supply different combinations of newly retrieved data and data left over from the previous word formation operation.

The combinations of such data are selectable in accordance with the allocation degree quantity supplied by CPU 18.

More specifically, it can be seen from Figure 2 that in the present invention one of eight buses, A, V1, V2, V3, V4, V5, V6, and V7, can be selected in accordance with the allocation degree quantity. Each of the V buses provides a different combination of the data received from multiplexers 36 and 38. These different combinations are provided by cross connect block 40.

Cross connect block 40 can be a set of hardwired paths, although other combining techniques such as multiplexers can be used in accordance with the present invention. The table contained within cross connect block 40 of Figure 2 illustrates how the bits of data received from multiplexers 36 and 38 are provided in different combinations and positions to the various buses V1 through V7. Bus A provides a direct path between memory 15 and the "0" input to multiplexer 34.

Multiplexers 36 and 38 receive data from memory 15, via bus A, and from source register 32, via bus B. More specifically, multiplexer 36 receives, at its input "0," data read from memory 15, and, at its input "1," data from source register 32. Selection between these sources of data for output to cross connect block 40 is determined by the direction flag from CPU 18 as applied to the multiplexer select inputs. Similarly, multiplexer 38 receives data from source register 32 at its "0" input, and data read from memory 15 at its "1" input. A direction flag state of "0" causes data applied at the "0" inputs of multiplexers 36 and 38 to be output, while a direction flag state of "1" causes data applied at the "1" inputs to be output. In this manner, block moves of data can be performed from left to right as well as right to left with a simple change in direction flag. In the event it is desired that block moves are to be made only in the left to right direction, multiplexers 36 and 38 can be dispensed with, and bus A connected directly to bus E and bus B connected directly to bus F.

Data is provided by multiplexer 36 to cross connect block 40 via bus E. Such data are designated in the table of cross connect block 40 with the symbol E, and with a subscript indicating the bit position of the data within a word. Similarly, data is provided by multiplexer 38 on bus F, and is designated with the letter F and subscripts indicating the bit position of the data within a word. Multiplexers 36 and 38 serve to route data to cross connect block 40 in the same bit positions as presented to them.
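The selection performed by multiplexers 36 and 38 can be modelled in a few lines. This is a sketch under stated assumptions: the function name `route_inputs` and the representation of a word as a Python list are hypothetical and chosen only for illustration.

```python
def route_inputs(direction, memory_word, source_reg):
    """Return (bus_e, bus_f) as selected by multiplexers 36 and 38.

    direction 0 (left to right): E carries memory data, F the source register.
    direction 1 (right to left): E carries the source register, F memory data.
    """
    if direction not in (0, 1):
        raise ValueError("direction flag must be 0 or 1")
    if direction == 0:
        return list(memory_word), list(source_reg)
    return list(source_reg), list(memory_word)


mem = ["m%d" % i for i in range(8)]   # word just read from memory (bus A)
reg = ["s%d" % i for i in range(8)]   # contents of the source register (bus B)
e_fwd, f_fwd = route_inputs(0, mem, reg)
e_rev, f_rev = route_inputs(1, mem, reg)
```

Swapping the two inputs is the only change between the two directions; bit positions within each word are passed through unaltered.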

CPU 18 also provides mask protect information to mask protect latch 44. This information is used to disable a write operation into specified locations of destination register 30 and to control multiplexer 34 when data is being read from a block boundary.

Multiplexer 42 supplies data to source register 32, which data originates from either memory 15 or the output of source register 32. Selection between these data sources is a function of the state of mask protect latch 44. In the embodiment shown in Figure 2, if the contents of mask protect latch 44 are non-zero, OR gate 46 will apply a logic 1 state to the select input of multiplexer 42, thereby selecting data from source register 32 for reinput to source register 32. This maintains the same data in source register 32 for use in the next word formation operation. Conversely, if the contents of mask protect latch 44 are all zeros, the data that originates from memory 15 will be supplied to source register 32 for storage there.

In the preferred embodiment of the present invention, multiplexer 42 is dispensed with and, instead, the signal from OR gate 46 is employed as a write inhibit signal to source register 32, with source register 32 receiving an input only from the read data line 14. Thus, when the output of OR gate 46 indicates that the content of mask protect latch 44 is other than all zeros, source register 32 will be write inhibited and the contents of source register 32 will be unchanged.

Returning to cross connect block 40, a more detailed explanation of its operation will now be provided. Cross connect block 40 receives eight bits, E0-E7, from multiplexer 36, and eight bits, F0-F7, from multiplexer 38. To read the table in cross connect block 40, each column represents a different "V" bus, and each row represents a bit position within the bus.

It is to be noted that while the embodiment in Figure 2 uses an eight-bit-wide word, other word widths can be used in accordance with the present invention. Thus, bus V1 routes bits F1 through F7 into its bit positions 0-6, and places bit E0 in its bit position 7.

Likewise, bus V4 places bits F4 through F7 into its first four bit positions, and places bits E0 through E3 into its last four bit positions. As indicated earlier, the V bus selected is a function of the allocation degree information received from CPU 18. It is to be understood that the bit position assignments of cross connect block 40 utilize the convention of least significant to most significant bits from left to right.
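The hold behaviour of source register 32 can be sketched as below. The function name `next_source_register` and the list-of-bits representation of mask protect latch 44 are assumptions for illustration; the text does not specify how the latch contents are encoded.

```python
def next_source_register(source_reg, read_data, mask_latch):
    """Model the write inhibit derived from OR gate 46.

    If any bit of the mask protect latch is set, the register keeps its
    contents; otherwise it is loaded with the word just read from memory.
    """
    write_inhibit = any(mask_latch)          # OR of all latch bits
    return list(source_reg) if write_inhibit else list(read_data)


held = next_source_register(["old"] * 8, ["new"] * 8, [0, 0, 0, 0, 1, 1, 1, 1])
loaded = next_source_register(["old"] * 8, ["new"] * 8, [0] * 8)
```

This is why a read of the destination location at a block boundary, made while the latch is non-zero, does not disturb the source data already held for the next destination word.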

When the convention of right to left for least significant to most significant bits is used, the bit position assignments should be modified accordingly.
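The routing performed by cross connect block 40 can be written compactly if a word is modelled as a list indexed by bit position. The sketch below generalises from the V1 and V4 cases given in the text; the table of Figure 2 is not reproduced here, so the general pattern and the function name `bus_v` should be read as assumptions.

```python
def bus_v(k, e, f):
    """Return the word presented on bus Vk (1 <= k <= 7) for 8-bit words.

    Bits Fk..F7 occupy bit positions 0..7-k, and bits E0..E(k-1) occupy
    the remaining positions 8-k..7. No shift or rotate is performed; the
    result is a fixed selection of input bits.
    """
    if not 1 <= k <= 7:
        raise ValueError("k must be between 1 and 7; bus A handles k == 0")
    return list(f[k:]) + list(e[:k])


E = ["E%d" % i for i in range(8)]
F = ["F%d" % i for i in range(8)]
v1 = bus_v(1, E, F)
v4 = bus_v(4, E, F)
```

Because every bus is a fixed wiring of the same sixteen inputs, choosing the allocation degree amounts to choosing one multiplexer input, which is what removes the rotate step from the software loop.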

The operation of the circuitry of Figure 2 will be more readily understood upon consideration of Figures 1, 3 and 4, and the accompanying example. Figure 3 is a flow diagram illustrating the operation of the present invention in relocating blocks of data. In order to more clearly illustrate the operation of the present invention, consider the following example in which block 10 in Figure 1 is relocated from its source position to the destination position 11. Figure 4 illustrates the contents of the source register 32 and destination register 30 during the relocation of the information in block 10 to block 11 in Figure 1, and also during the relocation of the information in block 12 to block 13 in Figure 1.

Determination of Allocation Degree

In accordance with the present invention, data from the memory 15 and from the source register 32 is allocated to the destination register 30 in accordance with a quantity which will be referred to as allocation degree. This quantity is a function of the starting address of the block of data at the source location, the starting address of the destination location, as well as the word width being used, i.e., the number of bits in a word. More specifically, a number of quantities are defined as follows:

XS = X-coordinate of the source location;
XD = X-coordinate of the destination location;
RS = XS modulo w, where "w" is the word width;
RD = XD modulo w, where "w" is the word width;
K = RS - RD, ignoring overflow, and expressed in two's complement; and
Direction = 0 for forward transfer (left to right), 1 for reverse transfer (right to left).

The quantity RS can be viewed as the bit position offset of the source location boundary from a word boundary, while RD can be viewed as the bit position offset of the destination location boundary from a word boundary. The quantity "K" is that which defines the allocation degree. More specifically, in order to determine "K" one examines the log2(w) least significant bits of the quantity RS - RD, expressed in two's complement. For example, if RS - RD = -1, K = 7. As another example, if RS - RD = -5, K = 3. All of the above assumes a word width of eight bits.

Referring now more particularly to the example in Figure 1, it can be seen that RS = XS modulo 8 = 2, and RD = XD modulo 8 = 4. As such, RS - RD = -2 = 1110 (two's complement). Log2(8) = 3; therefore, K equals the three least significant bits of RS - RD, i.e., K = 6.

Thus, the allocation degree for this relocation is 6.
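The calculation of K can be checked with a short sketch. The function name `allocation_degree` is hypothetical; masking with w - 1 keeps the log2(w) least significant bits of the two's complement difference, which assumes that w is a power of two.

```python
def allocation_degree(xs, xd, w=8):
    """Return K, the log2(w) least significant bits of RS - RD."""
    if w <= 0 or w & (w - 1):
        raise ValueError("word width must be a positive power of two")
    rs = xs % w                  # offset of the source boundary within its word
    rd = xd % w                  # offset of the destination boundary
    return (rs - rd) & (w - 1)   # Python integers behave as two's complement here


k_fig1_block10 = allocation_degree(2, 4)   # RS = 2, RD = 4
k_minus_one = allocation_degree(0, 1)      # RS - RD = -1
k_minus_five = allocation_degree(0, 5)     # RS - RD = -5
```

Only the offsets within a word matter, so adding any multiple of the word width to either coordinate leaves K unchanged.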

Assume also that the direction of transfer is from left to right so that the direction flag will be 0.

Determination of Mask Protect

The mask protection information is defined by examining the block boundary at the destination location and determining the number of bit positions, in the word in which the block boundary is found, which will be taken up by the block of bits being relocated. In the example, the last four bit positions will be taken up; thus the mask protection information will operate to protect the last four positions in the word. As will be described in further detail hereinbelow, during the formation of the destination words at the block boundary, original data is read from the location to which the destination word is to be written. This data is then written into destination register 30, but with those bit positions holding data from the source location being protected by the mask protect information.

Transfer of Data at the Block Boundary

As can be seen in Figure 3, after the allocation degree and direction are specified in step 48, mask protect latch 44 is cleared in step 50. Next, in step 52, it is determined whether the word currently being read from memory 15 includes a boundary of the block 10 being relocated. If so, the next word from the source location is read, and selected ones of its bits are allocated to the destination register 30 in accordance with the allocation degree quantity (step 54).

Before proceeding further, a determination is made in step 55 whether the operation is at the start of the block being moved. If not, this means that the operation is at the end of the block, and step 60 can be executed immediately.

If the operation is at the start of the block to be moved, step 56 determines the direction in which the move is to take place. If the move is from left to right, step 57 is executed; if it is from right to left, step 58 is executed. In steps 57 and 58 it is determined whether a second read is needed to acquire enough bits for the destination location.

More particularly, step 57 determines whether there are more bits, in the word coming from the source location, than can be accommodated by the bit positions available in the destination location. If RS is greater than RD, this indicates that the bit positions available at the destination location require bits from the word currently being read from the source location as well as from the next word to be read from the source location.

Conversely, if RS is less than or equal to RD, this means that the word from the source location contains as many bits as, or more bits than, can be accommodated by the bit positions available at the destination location. Thus, if in step 57 it is determined that RS is greater than RD, step 59 is executed, in which a second read is undertaken to obtain the additional bits from the source location needed to complete the word for the destination location. If in step 57 RS is not greater than RD, no additional read operation is needed before proceeding to step 60.
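One way to derive the protected bit positions for a given destination word is sketched below for the left to right case. The text does not specify how the mask is encoded, so the list of booleans (True meaning "holds relocated data, do not overwrite with background"), the function name `protect_mask`, and the block length used in the usage lines are all assumptions.

```python
def protect_mask(xd, n, word_index, w=8):
    """Return one flag per bit position of destination word `word_index`.

    A flag is True where the position lies inside the destination block,
    i.e. at pixel addresses xd .. xd + n - 1, and must therefore be
    protected when the existing destination word is read back.
    """
    base = word_index * w
    return [xd <= base + p < xd + n for p in range(w)]


# Hypothetical block: starts at bit position 4 of word 0 and is 13 pixels long.
first_word = protect_mask(4, 13, 0)
middle_word = protect_mask(4, 13, 1)
last_word = protect_mask(4, 13, 2)
```

For the first word this reproduces the case in the text, where the last four positions are protected; a word lying wholly inside the block is protected everywhere, so no background needs to be read for it.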

Similarly, step 58 determines, for the right to left direction of movement, whether a second read is needed to provide the bits needed to complete the word for the destination location Because the direction of movement is right to left, the condition that RD is greater than RS will indicate that a second read, step 59, is needed.
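The decisions made in steps 57 and 58 reduce to a single comparison each. The sketch below uses a hypothetical function name, `needs_second_read`, and simply restates the two conditions given in the text.

```python
def needs_second_read(direction, rs, rd):
    """Return True if step 59 (a second source read) is required.

    direction 0 (left to right): needed when RS is greater than RD.
    direction 1 (right to left): needed when RD is greater than RS.
    """
    if direction not in (0, 1):
        raise ValueError("direction flag must be 0 or 1")
    return rs > rd if direction == 0 else rd > rs


block_10 = needs_second_read(0, 2, 4)       # Figure 1, block 10
block_12 = needs_second_read(0, 7, 6)       # Figure 1, block 12
right_to_left = needs_second_read(1, 3, 6)  # Figure 5A
```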

The above steps can be more readily appreciated upon reference to Figures 4 and 5B. These figures include three columns, the first indicating time, the second indicating the contents of source register 32, and the third indicating the contents of destination register 30. In Figure 4, at time T0, which corresponds to step 54 in Figure 3, the first word from the source location is read from memory 15. As determined earlier, the allocation degree quantity is 6; thus the data on bus V6 will be transferred by multiplexer 34 to destination register 30. This means that the data in the last two bit positions from multiplexer 38 will occupy bit positions 0 and 1 out of multiplexer 34, while the data in the first six bit positions from multiplexer 36 will occupy bit positions 2-7 out of multiplexer 34. Note that data from multiplexer 38 originates from source register 32; hence the last two bit positions out of multiplexer 38 originate from source register 32.

Since this is the first word being retrieved from the source location, source register 32 can have all zeros in it; in other words, these are "don't care" bits. Thus, the last two bit positions of source register 32 will be zeros, which will cause the first two bit positions out of multiplexer 34 to be zeros. Note also that the data out of multiplexer 36 are taken from the first word being read from the source location in memory 15. Thus the first six bit positions from multiplexer 36 will be routed to bit positions 2-7 out of multiplexer 34, and will correspond to the sequence B, B, Ia, Ib, Ic, and Id. Further, the complete word being read from memory 15 is stored in source register 32 for use in the formation of the next destination word.

The above is an example of when RS is less than RD.

The sequence shown at time slots T6 and T7 in Figure 4 illustrates the situation when RS is greater than RD.

In this example, a block of bit-wise data 12, Figure 1, will be moved to a destination location indicated by block 13. The source location has an RS equal to 7, while the RD for the destination location equals 6. The allocation degree for this example is 1. Thus, data coming out of cross connect block 40 on bus V1 will be supplied by multiplexer 34 to destination register 30.

At time slot T6, it is assumed that source register 32 contains all zeros. The first word read out of source location 12 includes all background data bits except for bit position 7, which includes the bit Ja.

Allocation of the contents of source register 32 and the data being read from memory 15 results in a word stored in destination register 30 including all zeros except for a background bit in bit position 7. In time slot T7 the next word from the source location is stored in source register 32 and allocated to destination register 30. More specifically, the data in bit position 0 of the next word from the source location is placed in bit position 7 of destination register 30. Further, the data that was stored in source register 32 at bit positions 1-7 are placed in bit positions 0-6 of destination register 30. Thus, in time slots T6 and T7, steps 54, 55, 56, 57 and 59 of the flow diagram of Figure 3 have been executed.

Figure 5B deals with the movement of the block at source location 10A to destination location 10B, Figure 5A, from right to left. This involves steps 54, 55, 56, 58 and 59 of Figure 3. From Figure 5A, it can be seen that RS is equal to three (3), and that RD is equal to six (6). Thus, data from the next word of the source location will be needed to complete the first word of the destination location.

In Figure 5B, time slot T0, the first word at the right-hand boundary of the block at the source location is written to the source register 32. The bit-wise data in this word is also allocated to the destination register 30 according to bus V5 of cross connect block 40, Figure 2. More particularly, the data in bit positions 5 through 7 of the first word are written into bit positions 0 through 2 of the destination register 30. Bit positions 3 through 7 of destination register 30 are "don't cares" at this point.

Since RD is greater than RS, the next word from the source location is read and allocated in time slot T1. This corresponds to step 59 in Figure 3. More particularly, the data in bit positions 5 through 7 of this next word are written to bit positions 0 through 2 of the destination register 30, and the data in bit positions 0 through 4 of the source register 32 are written to bit positions 3 through 7 of the destination register 30.
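Time slots T6 and T7 can be traced with a small model of the left to right data path. The helper name `form_word` and the pixel labels after Ja (Jb, Jc and so on) are assumptions, since Figure 4 itself is not reproduced here.

```python
def form_word(k, memory_word, source_reg):
    """Left to right case: bus A for k == 0, otherwise bus Vk.

    Bus E carries the word read from memory; bus F carries source register 32.
    """
    if k == 0:
        return list(memory_word)
    return list(source_reg[k:]) + list(memory_word[:k])


K = 1                                   # RS = 7, RD = 6

# T6: source register holds zeros; the first source word is read.
source_reg = [0] * 8
first = ["B"] * 7 + ["Ja"]
dest_t6 = form_word(K, first, source_reg)
source_reg = first                      # the word read is kept for the next step

# T7: second read (step 59); labels beyond Ja are hypothetical.
second = ["Jb", "Jc", "Jd", "Je", "Jf", "Jg", "Jh", "Ji"]
dest_t7 = form_word(K, second, source_reg)
source_reg = second
```

After T7 the destination register holds Ja and Jb in bit positions 6 and 7, which are the two positions that the later mask protect step keeps while positions 0-5 receive background data.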
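The right to left routing of time slots T0 and T1 in Figure 5B can be traced in the same way. With the direction flag at 1, bus E carries the source register and bus F carries the word read from memory. The helper name `form_word_reverse` and the labels r0-r7 and q0-q7 are hypothetical placeholders for the bits of the first and next source words.

```python
def form_word_reverse(k, memory_word, source_reg):
    """Right to left case (direction flag 1): bus A for k == 0, else bus Vk."""
    if k == 0:
        return list(memory_word)
    e, f = source_reg, memory_word      # multiplexers 36 and 38 swap the inputs
    return list(f[k:]) + list(e[:k])


K = (3 - 6) & 7                         # RS = 3, RD = 6 gives K = 5

# T0: first word at the right-hand boundary; source register is don't-care.
source_reg = ["x"] * 8
first = ["r%d" % i for i in range(8)]
dest_t0 = form_word_reverse(K, first, source_reg)
source_reg = first

# T1: the next word to the left is read (step 59).
nxt = ["q%d" % i for i in range(8)]
dest_t1 = form_word_reverse(K, nxt, source_reg)
```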

In the examples provided so far, the first word of the block is dealt with. This means that provision should be made to protect the integrity of the adjacent background data at the block boundary in the destination location. As can be seen from Figure 1, the block boundary at destination location 11 for the block of data 10 includes 4 bits of background information (S0-S3) in bit positions 0-3 of the first word.

Similarly, the destination location 13 for the block of data 12 includes 6 bits of background data (U0-U5).

In Figure 5A, the block boundary at destination location 11A includes 2 bits of background data (R30 and R31) in bit positions 6 and 7.

Steps 60 and 62 in the flow diagram of Figure 3 operate to provide such protection. More specifically, in step 60 the mask protect information from CPU 18 is written into mask protect latch 44. For the example in time slot T1 of Figure 4, it is to be noted that the destination location has four bit positions which are occupied by background information. This means that the four remaining bit positions accommodate the data being relocated. The mask protect information from CPU 18 protects these remaining bit locations and permits writing into the other bit locations.

Thus, in step 62, Figure 3, when the destination location in memory 15 is read, and the data is written into destination register 30, only those bit positions corresponding to background information are actually written into destination register 30. Thus, in time slot T1, Figure 4, it can be seen that the contents of destination register 30 include the first four bits of background information and the first four bits from the bit block 10.
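Time slots T0 and T1 of Figure 4 can be reproduced with the following sketch. The helper name `merge_background`, the boolean encoding of the mask, and the labels Ie, If and S4-S7 are assumptions; only B, Ia-Id and S0-S3 are named in the text.

```python
def merge_background(dest_reg, destination_word, protect):
    """Step 62: keep protected bits, fill the rest from the destination word."""
    return [d if keep else b
            for d, b, keep in zip(dest_reg, destination_word, protect)]


K = 6                                             # RS = 2, RD = 4

# T0 (step 54): bus V6 combines the source register with the word read.
source_reg = [0] * 8
first = ["B", "B", "Ia", "Ib", "Ic", "Id", "Ie", "If"]
dest_t0 = source_reg[K:] + first[:K]

# T1 (steps 60 and 62): protect the last four positions, read the destination.
protect = [False] * 4 + [True] * 4
existing = ["S0", "S1", "S2", "S3", "S4", "S5", "S6", "S7"]
dest_t1 = merge_background(dest_t0, existing, protect)
```

The two background bits B that were routed into positions 2 and 3 at T0 are simply overwritten by S2 and S3, so no separate step is needed to discard them.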

In connection with time slot T 8 of Figure 4, it can be seen that the mask protect information from CPU 18 will protect bit positions 6 and 7 from being written into and permit bit positions 0-5 to receive the background bits from the destination location.

In connection with time slot T 2 of Figure 5 B, it can be seen that the mask protect information from CPU 18 will protect bit positions 0-4 from being written into and permit bit positions 5-7 to receive the background bits from the destination location.

Once the destination register has received all inputs, its contents are written into the destination location (see step 64, Figure 3). Next, step 66 is executed, in which an inquiry is made as to whether all of the data in the block has been transferred. If so, the move operation is ended, step 68; if not, the procedure returns to step 50.

Once a block boundary has been dealt with, the remaining words in the block are pipelined through the circuitry of Figure 2 by way of steps 70 and 72. During this phase CPU 18 need perform only read and write operations. When the other boundary of the block is encountered, step 52 will cause step 54 to be re-executed. Under these circumstances, step 55 will cause steps 56, 57, 58 and 59 to be bypassed. The write mask protect will be redefined in step 60 for this other boundary. Thus, in time slot T5 of Figure 4 the mask protect information from CPU 18 will protect bit position 0 from being overwritten by data being read out of the destination location, while bit positions 1-7 will not be so protected. With respect to time slot T11, bit positions 0 and 1 will be protected, while positions 2-7 will be permitted to be overwritten with the adjacent background information. Similarly, in Figure 5B, time slot T5, bit position 7 will be protected, while positions 0-6 will be permitted to be overwritten with the adjacent background information.

In the above manner, blocks of data can be quickly and efficiently transferred from one point in memory to another random point in memory. Further, the above arrangement permits data to be moved from left to right or from right to left in basically the same fashion with basically the same steps.
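The overall word-by-word transfer can be modelled in software as a rough check on the scheme. The following is a minimal sketch under stated assumptions, not the circuit of Figure 2: it covers only one direction of travel, reads from an unmodified copy of the memory so that overlapping blocks are not modelled, assumes an 8-bit word, and takes each candidate destination word to be eight consecutive bits of the held word followed by the incoming word. The names `funnel`, `block_move` and `reference_move` are hypothetical.

```python
W = 8  # word width in bits (assumed)


def funnel(held, incoming, degree):
    """Pick W consecutive bits, starting at `degree`, from held + incoming."""
    return (held + incoming)[degree:degree + W]


def block_move(mem, src_bit, dst_bit, n_bits):
    """Copy n_bits from bit address src_bit to dst_bit, one word at a time.

    mem is a list of words; each word is a list of W bits, position 0 first.
    Reads come from the original `mem`, so overlap is not modelled.
    """
    out = [word[:] for word in mem]
    delta = src_bit - dst_bit
    degree = delta & (W - 1)          # low log2(W) bits, two's complement
    word_shift = delta // W           # floor division, may be negative

    def read(index):
        # Out-of-range words only ever supply bits that the mask discards.
        return mem[index] if 0 <= index < len(mem) else [0] * W

    first = dst_bit // W
    last = (dst_bit + n_bits - 1) // W
    held = read(first + word_shift)   # prime the source register
    for t in range(first, last + 1):
        incoming = read(t + word_shift + 1)
        aligned = funnel(held, incoming, degree)
        # True = position receives relocated data; False = keep background.
        protect = [dst_bit <= t * W + p < dst_bit + n_bits for p in range(W)]
        out[t] = [a if keep else b
                  for a, b, keep in zip(aligned, mem[t], protect)]
        held = incoming               # the word just read becomes the held word
    return out


def reference_move(mem, src_bit, dst_bit, n_bits):
    """Bit-by-bit copy used only to check block_move."""
    flat = [bit for word in mem for bit in word]
    result = flat[:]
    result[dst_bit:dst_bit + n_bits] = flat[src_bit:src_bit + n_bits]
    return [result[i:i + W] for i in range(0, len(result), W)]


memory = [[f"w{w}b{p}" for p in range(W)] for w in range(6)]
moved = block_move(memory, src_bit=2, dst_bit=29, n_bits=13)
print(moved == reference_move(memory, 2, 29, 13))
```

Only the first and last destination words need a partial mask; every word in between is fully occupied by relocated data, which corresponds to the pipelined phase in which the CPU performs only reads and writes.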

It is to be understood that the central processing unit will be provided with the beginning and ending source and destination addresses and a specification of the width of the word used, and further that the CPU will be supplied with an indication of the direction in which the transfer is to take place.

It is to be noted that the circuitry of the present invention is especially suited to operate with negative numbers expressed in 2's complement form. As such, no additional conversion of the allocation degree data from the CPU is needed. Assuming a word width of 8 bits, the allocation degree information can be obtained by connecting the three least significant bits of the output from CPU 18, for the quantity RS - RD, to the select input for multiplexer 34 by way of AND gates 74.
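The reason no conversion is needed can be seen from a short worked example. In the sketch below, RS and RD are taken to be the bit position offsets of the source and destination boundaries, and the function name `allocation_degree` is hypothetical. Python's `&` operator treats negative integers as two's complement values, so masking with 7 keeps exactly the three least significant bits.

```python
def allocation_degree(rs, rd, word_width=8):
    """Low log2(word_width) bits of rs - rd, read as an unsigned number."""
    if word_width < 2 or word_width & (word_width - 1):
        raise ValueError("word_width is assumed to be a power of two")
    select_bits = word_width.bit_length() - 1   # log2(W): 3 for W = 8
    return (rs - rd) & ((1 << select_bits) - 1)


print(allocation_degree(6, 2))   # 4
print(allocation_degree(2, 5))   # -3 is ...11111101, low three bits 101 = 5
```

A negative difference therefore yields a valid select value in the range 0-7 without any separate sign handling.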

Inverter 76 receives at its input the output of OR gate 46. The output of inverter 76 is supplied as an input to each of the AND gates 74. Thus, when mask protect latch 44 is non-zero, AND gates 74 will be disabled and the data on bus A will be supplied directly to destination register 30. This occurs when the background data is being written into the destination register while the boundary points of the block are being processed.
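The gating just described can be sketched at the logic level as follows. This is an illustrative model only: the function name `mux_select`, the three-bit select width (for an 8-bit word), and the representation of the latch as a list of bits are assumptions.

```python
def mux_select(allocation_degree, mask_protect_latch):
    """Return the select value that reaches multiplexer 34."""
    or_gate_46 = any(mask_protect_latch)        # 1 if any latch bit is set
    inverter_76 = int(not or_gate_46)           # enable for the AND gates
    # AND gates 74: one gate per select bit, all sharing the same enable.
    select_bits = [(allocation_degree >> i) & 1 for i in range(3)]
    gated_bits = [bit & inverter_76 for bit in select_bits]
    return sum(bit << i for i, bit in enumerate(gated_bits))


print(mux_select(5, [0, 0, 0, 0, 0, 0, 0, 0]))  # latch clear: degree passes
print(mux_select(5, [0, 0, 0, 0, 1, 1, 1, 1]))  # latch non-zero: forced to 0
```

With the latch non-zero the select value is forced to zero, which corresponds to the direct path from bus A to the destination register described above.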

While the above description of the present invention has been in terms of a single plane of memory 15, it should be apparent that the present invention can be easily extended to multiple planes of memory. In such a case, circuitry corresponding to destination register 30, source register 32, multiplexers 34, 36, 38, and 42, and cross connect block 40, will be supplied for each plane of memory. The mask protect data, allocation degree data, address information, and direction flag will be the same for all such planes.

As can be appreciated from the above description, the present invention requires a minimum of intervention by the CPU 18 and also employs a minimum amount of hardware in its implementation. For example, multiplexers 36, 38 and 42 can each be implemented with eight 2:1 multiplexers. Multiplexer 34 can be implemented with eight 8:1 multiplexers.
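To illustrate why eight 8:1 multiplexers suffice for multiplexer 34, the sketch below gives each bit position of the destination word its own 8:1 multiplexer, all sharing one three-bit select. The wiring of the cross connect shown here (candidate k is the eight consecutive bits starting at position k of the first word followed by the second) is an assumption made for the example, as are the names `mux8`, `cross_connect` and `multiplexer_34`.

```python
def mux8(inputs, select):
    """One 8:1 multiplexer: pass the input chosen by the 3-bit select."""
    assert len(inputs) == 8 and 0 <= select < 8
    return inputs[select]


def cross_connect(first, second):
    """Hypothetical hardwired block producing eight candidate words."""
    window = first + second
    return [window[k:k + 8] for k in range(8)]


def multiplexer_34(combinations, select):
    """Eight 8:1 multiplexers, one per bit position of the output word."""
    return [mux8([combinations[k][bit] for k in range(8)], select)
            for bit in range(8)]


first = [f"a{i}" for i in range(8)]    # e.g. bits from the holding register
second = [f"b{i}" for i in range(8)]   # e.g. bits from the memory word
selected = multiplexer_34(cross_connect(first, second), 3)
print(selected)
```

Because the cross connect is fixed wiring, the only variable input is the select value, so the per-word cost is a single multiplexer delay rather than a multi-step shift.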

It should also be appreciated that the present invention can accommodate rapidly changing block dimensions since only the allocation degree determination involves any appreciable CPU time.

Further, the present invention provides for protection of background bits in the vicinity of the block boundary in a simple and efficient manner.

All of the above contribute to a very fast, efficient block move capability which is useful in windowing and the graphics environment in general. As display systems move toward larger screens and more colors, there will be an exponential increase in CPU overhead dedicated to block moves of data. The present invention goes far toward reducing such demands on the CPU. In the case of complex operating systems there is much more use of swapping code out of slower secondary memory. It is believed that the present invention represents or permits the reduction in code size by a factor of ten. As such, graphics programs will be smaller, will run faster, and may even be resident in system memory at all times, rather than in secondary memory.

Claims

1. An apparatus under the control of a central processing unit for relocating a block of bit-wise data from a source location in a memory to a destination location in the memory, wherein bit-wise data is organized in, written to, and retrieved from, the memory in the form of words of predetermined width, and the source and destination locations can have boundaries at different bit positions within different words in the memory, comprising destination word means receiving and storing words of bit-wise data for forming the bit-wise data into a destination word to be written to the destination location; holding means receiving and storing bit-wise data for use in the destination word means; and word splitting means communicating with the memory, the destination word means and the holding means, for allocating bit-wise data in the words received from the memory to the destination word means and to the holding means, and for allocating bit-wise data received from the holding means to the destination word means so that the destination word is formed such that the bit-wise data from the source location is placed in the proper bit position within the destination word required for writing the destination word to the destination location.

2. The apparatus of claim 1, wherein the word splitting means include means responsive to the source location and the destination location for determining an allocation degree; and means receiving the bit-wise data from the memory and from the holding means and responsive to the allocation degree for routing selected ones of the bit-wise data from the memory and from the holding means to the destination word means in accordance with the allocation degree.

3. The apparatus of claim 2, wherein the allocation degree determining means include offset means for determining a bit position offset of the source location boundary relative to a word boundary and for determining a bit position offset of the destination location boundary relative to a word boundary; and degree determining means responsive to the bit position offsets of the source location and the destination location boundaries for determining the allocation degree according to a relationship in which the allocation degree is equal to the magnitude represented by a predetermined number of least significant bits of the difference in the bit position offsets, expressed in two's complement form, wherein the predetermined number is equal to log2(W), where W is the width, in bits, of a word.

4. The apparatus of claim 3, wherein a protect mask is supplied when the destination location boundary occurs within a word, and further wherein the routing means comprise cross connect means responsive to the bit-wise data from the holding means and from the memory for providing a plurality of predetermined combinations of the bit-wise data from the holding means and the memory; selection means, responsive to the plurality of predetermined combinations of bit-wise data and to the allocation degree, for providing a selected one of the plurality of predetermined combinations of bit-wise data to the destination word means in accordance with the allocation degree; and holding select means responsive to the bit-wise data from the memory and from the holding means, and to the protect mask, for routing the bit-wise data from the memory, or rerouting bit-wise data from the holding means, to the holding means as a function of the protect mask.

5. The apparatus of claim 4, wherein the cross connect means comprise hardwired circuitry.

6. The apparatus of claim 4, wherein the selection means comprise multiplexer means, responsive to the allocation degree and coupled to the cross connect means to receive the plurality of predetermined combinations, wherein the multiplexer means outputs a particular one of the plurality of predetermined combinations as selected by the allocation degree.

7. The apparatus of claim 3, wherein the allocation degree determining means are provided by the central processing unit under software control.

8. The apparatus of claim 4, wherein the central processing unit provides a direction flag which indicates the direction across the memory in which the relocation of the block of bit-wise data is desired to occur, and further wherein the cross connect means include first input means for receiving a first word of bit-wise data arranged in predetermined bit positions; second input means for receiving a second word of bit-wise data arranged in predetermined bit positions; direction multiplexer means responsive to the direction flag and to bit-wise data from the memory and from the holding means for directing the bit-wise data from the holding means and from the memory to the first input means or the second input means as a function of the direction flag; and means coupled to the first input means and to the second input means for combining the bit-wise data from the first input means with the bit-wise data from the second input means to form the plurality of predetermined combinations, so that the relocation of the block of bit-wise data can proceed from left-to-right or right-to-left as a function of the direction flag.

9. The apparatus of claim 1, further including means for identifying when the word of bit-wise data being read from the source location in the memory includes a boundary of the block of bit-wise data being relocated; means for masking the bit positions in the destination word means which accommodate the bit-wise data from the source location so that the bit positions cannot be written into; and means for writing the contents of the destination location to the destination word means so that the destination word formed by the destination word means includes the bit-wise data from the source location and background bit-wise data from the destination location adjacent to the destination location boundary.

10. The apparatus of claim 9, wherein the destination word means are responsive to a write enable signal for each of the bit positions therein, and the central processing unit provides write enable signals for each of the bit positions of the destination word means, and further wherein the means for masking include mask protect latch means coupled to the central processing unit and to the destination word means for storing the write enable signal from the central processing unit for each of the bit positions in the destination word means, and for applying the write enable signals to the destination word means.

11. The apparatus of claim 10, wherein the selection means includes direct routing means responsive to the write enable signal from the central processing unit for applying the bit-wise data being read from the memory from the source location directly to the destination word means whenever the write enable signals from the central processing unit indicate that a bit position of the destination word means is being protected from a write operation.

12. The apparatus of claim 11, wherein the direct routing means include logic means responsive to the content of the mask protect latch means for providing an indication whenever the contents of the mask protect latch means are non-zero; gate means, responsive to the allocation degree from the central processing unit and to the logic means, for providing the allocation degree to the selection means and for overriding the allocation degree whenever the contents of the mask protect latch means are non-zero.

13. A method for relocating a block of bit-wise data from a source location in a memory to a destination location in the memory, wherein bit-wise data is organized in, written to, and retrieved from, the memory in the form of words of predetermined width, and the source and destination locations can have boundaries at different bit positions within different words in the memory, comprising the steps of a. determining a bit position offset of the source location boundary relative to a word boundary and the bit position offset of the destination location boundary relative to a word boundary; b. determining an allocation degree quantity according to a relationship in which the allocation degree is equal to the magnitude represented by a predetermined number of least significant bits of the difference in the bit position offsets of the source location and destination location boundaries, expressed in two's complement form, wherein the predetermined number is equal to log2(W), where W is the width, in bits, of a word; c. holding bit-wise data from a word previously retrieved from the source location in a source register; d. retrieving a next word of bit-wise data from the source location; e. allocating the bit-wise data held in the source register and the bit-wise data from the next word to a destination register in accordance with the allocation degree quantity; f. storing the next word of bit-wise data in the source register; and g. repeating steps "c" through "f" until all of the bit-wise data has been relocated from the source location to the destination location.

14. The method of claim 13, wherein the boundary of the destination location is to be located within a particular word so that it is offset from the word boundary, further including the steps of h. determining which bit positions within the particular word correspond to the bit positions in the particular word which are to accommodate bit-wise data from the source location; i. reading the particular word from the destination location; j. writing the particular word to the destination register to all bit positions except those which are to accommodate bit-wise data from the source location; and k. writing the resulting contents of the destination register to the destination location.

15. The method of claim 13, wherein step "b" further includes the step of providing the allocation degree quantity in two's complement format.

16. An apparatus for moving bits of data from a source location in a memory to a destination location in the memory, wherein the memory is organized in the form of words of a predetermined number of bits of data stored at designated word addresses so that all reading and writing into the memory is in the form of words, and further wherein the source location can have a boundary at a bit position which is offset from a word boundary and the destination location can have a boundary at a bit position which is offset from a word boundary by an amount different from that of the source location boundary, the apparatus comprising means for determining an allocation degree quantity according to a relationship in which the allocation degree is equal to the magnitude represented by a predetermined number of least significant bits of the difference in the bit position offsets of the source location and destination location boundaries, expressed in two's complement form, wherein the predetermined number is equal to log2(W), where W is the width, in bits, of a word; destination register means for storing bits of data; source register means for storing bits of data; cross connect means responsive to the allocation degree quantity and communicating between the memory, the destination register means, and the source register means, for routing a combination of the bits of data from the memory and from the source register means to the destination register means to form a destination word, wherein the particular bits of data which are combined are a function of the allocation degree quantity; and means for writing the destination word in the destination location of the memory.

17. Apparatus as claimed in Claim 1 substantially as herein described with reference to the accompanying drawings.

18. A method as claimed in Claim 13 substantially as herein described with reference to the accompanying drawings.

Published 1989 at The Patent Office, State House, 66/71 High Holborn, London WC1R 4TP. Further copies may be obtained from The Patent Office, Sales Branch, St Mary Cray, Orpington, Kent BR5 3RD. Printed by Multiplex techniques ltd, St Mary Cray, Kent. Con 11/87