WO2000024203A1 - Parallel processor for motion estimator - Google Patents
- Publication number
- WO2000024203A1 (PCT/GB1999/003438)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- processor
- parallel
- data
- input
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/523—Motion estimation or motion compensation with sub-pixel accuracy
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/43—Hardware specially adapted for motion estimation or compensation
Definitions
- This invention relates to video encoding and decoding, and in particular to the calculation of motion vectors in a video compression system such as MPEG-2.
- the MPEG-2 video standard is defined in ISO/IEC
- Video compression is achieved in a number of separate ways including intra-frame coding and inter-frame coding.
- Intra-frame coding reduces video data first by quantising discrete cosine transform (DCT) coefficients of spatial data.
- DCT Discrete Cosine Transform
- VLC Variable Length Coding
- RLC Run Length Coding
- Inter-frame compression seeks to eliminate information which is redundant by virtue of its having been present in a past or future image, defined as an anchor frame.
- the anchor frame is a full resolution, full data picture.
- motion vectors are used to predict a present frame from an anchor frame.
- Motion vectors are assigned at a macroblock level and the predicted frame is subtracted from the actual frame to form a difference frame which has a much lower information content than the actual frame.
- the content of the difference frame will depend on the accuracy of the predicted frame.
- the predicted frame is developed from an inverse-quantised, IDCT-decoded picture.
- Inter-frame prediction may be based solely on forward prediction from intra-frame coded images or other forward predicted frames, or be bi-directionally predicted from both a previous and a future intra-frame coded or forward predicted frame.
- Bidirectional coding necessarily means that the video input order must be changed so that the past and the forward anchor frames are known.
- the MPEG-2 standard provides a number of defined system configurations which are represented as levels and profiles as shown in table 1 below.
- the MPEG-2 standard is designed to be scalable, that is decoders and encoders do not need to be of comparable quality to work together. It is desirable to design motion estimation processors which use corresponding VLSI technologies for the corresponding MPEG profiles . Where possible it is desirable that the processors should be on a single chip. However, where this is not yet possible, for the highest profiles and levels, it is desirable to be able to operate a plurality of motion estimation processors in parallel.
- X,Y are the coordinates of the left upper corner of the anchor frame macroblock
- Z,G are the coordinates of the left upper corner of the current frame macroblock
- (Z-X, G-Y) are the motion vector coordinates for the current macroblock being examined; and M,N are the macroblock dimensions in pixels.
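The full search criterion referred to as equation (1) is not reproduced in this text, but the variable definitions above describe the conventional sum-of-absolute-differences (SAD) full search. The Python sketch below illustrates it under that assumption; the frames are plain nested lists and all function and parameter names are illustrative, not taken from the patent.

```python
def sad(current, anchor, Z, G, X, Y, M, N):
    """Sum of absolute differences between the current-frame macroblock
    with upper left corner (Z, G) and the anchor-frame macroblock with
    upper left corner (X, Y); M, N are the macroblock dimensions."""
    total = 0
    for i in range(M):
        for j in range(N):
            total += abs(current[Z + i][G + j] - anchor[X + i][Y + j])
    return total

def full_search(current, anchor, Z, G, M, N, search):
    """Exhaustive (full) search: returns the motion vector (Z-X, G-Y)
    of the anchor-frame macroblock with the smallest SAD inside a
    +/- search window around the current macroblock position."""
    best = None
    for X in range(max(0, Z - search), Z + search + 1):
        for Y in range(max(0, G - search), G + search + 1):
            # stay inside the anchor frame
            if X + M <= len(anchor) and Y + N <= len(anchor[0]):
                d = sad(current, anchor, Z, G, X, Y, M, N)
                if best is None or d < best[0]:
                    best = (d, Z - X, G - Y)
    return best[1], best[2]
```

In hardware the same comparisons are distributed over the matrix of processing elements; this sequential sketch only captures the arithmetic of the criterion.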
- a half pixel precision search can be understood as being a linear interpolation of adjacent pixels.
- A,B,D,E represent pixels of the original luminance matrix and h,v,c and the two unidentified points represent half-pixels.
- the half pixels are calculated by the following linear interpolations:
- h = (A + B) / 2 (2)
- v = (A + D) / 2 (3)
- c = (A + B + D + E) / 4 (4)
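The interpolations can be sketched directly. Equations (2) and (3) are stated later in the text (half the sums A+B and A+D respectively); the central interpolation (4) is assumed here to be the standard four-pixel average, as the original equation is not reproduced.

```python
def half_pixel_interpolate(A, B, D, E):
    """Half-pixel values from four adjacent full pixels laid out as
        A B
        D E
    h lies between A and B, v between A and D, c at the centre."""
    h = (A + B) / 2           # equation (2): horizontal interpolation
    v = (A + D) / 2           # equation (3): vertical interpolation
    c = (A + B + D + E) / 4   # equation (4): central interpolation (assumed form)
    return h, v, c
```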
- motion estimation requires the vectors of a number of macroblocks to be determined, and as video information is both spatial and temporal, parallel computing techniques are ideal for motion estimation.
- the architecture disclosed has the disadvantage that it only works with a given macroblock size and is not suitable for processing with half-pixel precision.
- the burst pipeline latency is such that a decrease of up to 50% in computational performance is possible.
- the architecture described has a high data bandwidth requirement as it has a large number of external ports for data input and output.
- This architecture is based on performing pipelined computations for a single row of pixels in a macroblock. This reduces pipeline latency and, potentially, can calculate motion vectors to half pixel precision by using four devices operating in parallel.
- the architecture has the disadvantage of a lower computational performance compared to the two-dimensional systolic matrix.
- US 5,636,293 discloses an architecture designed to increase the computational performance of the one-dimensional systolic matrix.
- a modular architecture is used which connects one-dimensional systolic matrices in tandem, allowing acceleration of calculations in the search window without increasing the number of data points.
- this architecture has the disadvantage that it does not provide half-pixel precision and computational performance is reduced as motion vectors for a single macroblock only can be searched for in the search window.
- US 5,719,642 discloses a systolic matrix with global links for anchor frame data input into the processing elements row of a single macroblock row processing architecture.
- increases in anchor frame data memory can achieve 100% exploitation of hardware.
- the computation performance is limited by the number of MxN processing elements which operate in parallel.
- the architecture of US 5,719,642 cannot calculate motion vectors with half-pixel precision.
- US 5,568,203 discloses an architecture in which the motion estimator inputs data serially into a matrix of shift registers and simultaneously loads in parallel the anchor frame pixel data into the MxN matrix of processing elements.
- the matrix of processing elements provides serial calculations of the full search algorithm (equation 1) .
- While this architecture has the advantage of minimising the number of input and output ports and fully utilizes hardware resources, it cannot calculate motion vectors with half-pixel precision.
- computational performance is impaired as only the MxN processing elements operate in parallel.
- US 5,030,953 discloses a matrix of signal processors, consisting of M parallel groups of sub-matrices with N parallel operating processing elements, which calculate the sum of subtractions of absolute values for a single row of macroblocks being compared.
- the architecture effectively utilizes hardware resources and minimises the number of I/O ports but has restricted computational performance as it searches the motion vector of a single macroblock of the current frame and cannot calculate motion vectors with half-pixel precision.
- the invention aims to overcome or ameliorate the disadvantages with the systems described above.
- the invention provides for the simultaneous comparison of S current frame macroblocks with the nK macroblocks of the anchor frame.
- K is the number of macroblocks in the area of the anchor frame with the coordinates of the left upper corner, defined with single pixel precision
- 4K is the number of macroblocks in the area of the anchor frame having the coordinates of the left upper corner corresponding to half-pixel precision.
- a parallel processor for estimating motion of a given portion of a current image frame with reference to an anchor frame comprising: an input for receiving current frame data; an input for receiving anchor frame data; a two-dimensional matrix of processing elements each for comparing a given area of the current frame with at least an area of the anchor frame wherein the matrix simultaneously compares S areas of the current frame with nK areas of the anchor frame, the matrix having dimensions of KxS and n being an integer; means for selecting from the comparison, for each area of the current frame, an area of the anchor frame corresponding to the area of the current frame; and means for outputting data identifying the selected areas of the anchor frame.
- Embodiments of the invention have the advantage of increasing computation performance by adding additional unitary modules without requiring any modification of the initial architecture or control signals, thus the system is truly modular. Furthermore, embodiments of the invention have the advantage that VLSI technology may be used to make individual devices which can calculate motion vectors for the various MPEG-2 levels and profiles and for video with any parameters.
- a preferred embodiment of the invention may have the advantage that half-pixel precision is achieved using the full anchor frame search by comparing pairs of current frame and anchor frame macroblocks .
- Figure 1 previously described, shows the movement of a macroblock between a past, present and future frame
- Figure 2 previously described, illustrates half pixel points within a given block of four adjacent pixels
- FIG. 3 is a block schematic diagram of the architecture of a motion vector processor embodying the invention.
- Figure 4 shows one of the processing elements of figure 3 in greater detail
- Figure 5 is an alternative realisation of the processing element of figure 4 for single pixel precision
- Figure 6 shows, in more detail, one of the parallel pipelined modules P of figure 5;
- Figure 7 shows, in more detail, one of the input modules of figure 3;
- Figure 8 shows, in more detail, the memory unit of figure 7;
- Figure 9 is a block diagram of the Bi module of figure 3;
- Figure 10 is a block diagram of the input B module of figure 3;
- Figure 11 is a flow chart showing the steps in the anchor frame data priming process for generation of macroblock coordinates
- Figure 12 shows, in more detail, the READ F step in figure 11;
- Figure 13 shows, in more detail, the WRITE T step in figure 11;
- Figure 14 shows, in more detail, the WRITE F step in figure 11;
- Figure 15 is a representation of an anchor frame divided into stripes for processing
- Figure 16 shows an MPEG processor including a motion vector processor embodying the invention
- Figure 17 shows, in block schematic form, the architecture of a multipoint videoconferencing system or a DVD system including the motion vector processor of figure 16.
- Figure 18 shows, in block schematic form, the architecture of a videophone system including the motion vector processor of figure 16.
- Figure 19 shows, in block schematic form, the architecture of a digital video camera including the motion vector processor of figure 16.
- Figure 20 shows, in block schematic form, the architecture of a television or video encoder including the motion vector processor of figure 16.
- the architecture of figure 3 is based on the simultaneous comparison of S current frame macroblocks with K macroblocks of the anchor frame. This may be a portion of the anchor frame or the whole anchor frame depending on the picture size.
- the macroblocks are preferably 16 x 16 luminance pixel blocks although the MPEG-2 standard also supports 16 x 8 luminance pixel blocks or even 8 x 8 chrominance blocks.
- a plurality of K input modules 20 each receives anchor frame data Ih, Iv on respective inputs 22,24.
- the output from the Input modules 20(1) to 20 (k) is identified as PI(1) to PI(k) and represents a transformed version of the input data.
- the outputs PI(1) to PI (k) are supplied to a matrix of KxS processing elements 26 identified as PE1.1 to PEk.S in figure 3.
- Output PI(1) is supplied to the inputs of each of the processing elements in the row PE1.x, that is, elements PE1.1, PE1.2 and PE1.S in figure 3.
- Output PI(2) is supplied to each of the processing elements in the row PE2.x, that is, elements PE2.1, PE2.2 and PE2.S, and so on, so that output PI(k) is input to elements PEk.1, PEk.2 ... PEk.S as shown in figure 3.
- the macroblocks B of the current frame are input on an input IB to an Input Module 30 which receives them and distributes the current frame macroblocks to S buffers B, shown as 32 (1) ...32 (S) in figure 3.
- the output of each current frame macroblock buffer B is provided as an input to each processing element in a column.
- buffer B1 provides an input to processing elements PE1.1, PE2.1 ... PEk.1, and so on.
- the outputs of the processing elements PE1.1 to PEk.S are provided as inputs to a row of S comparator modules MIN 1 to MIN S identified by the numeral 34.
- the comparators are connected to each processing element in a column of the matrix.
- comparator MIN(1) receives at its input the outputs of processing elements PE1.1, PE2.1 ... PEk.1, and so on.
- the comparators 34 process the inputs to provide X,Y coordinates of matching anchor frame macroblocks for given current frame macroblocks.
- the X,Y coordinate is the upper left hand coordinate of the block.
- the comparators then pass this coordinate data to the output block 36.
- the element PEa.b has an input from current frame macroblock buffer Bb and an output to comparator MINb.
- the element comprises four identical parallel-pipelined processing modules 40, shown as Pc, Pv, Ph and PA, each of which has an output to a comparator MINP 42.
- Each of the parallel-pipelined processing modules 40 receives as its inputs the output PB from the column macroblock buffer, in this case PBb, and an input PI from the row input module.
- the Input PI comprises four separate inputs Ic, Iv, Ih and IA which are input respectively to processing modules Pc, Pv, Ph and PA.
- the processing modules 40 perform parallel comparison of a single macroblock of the current frame provided from buffer B with four interpolations of a macroblock of the anchor frame having coordinates c,v,h and A as defined with reference to figure 2 earlier.
- the comparison is made with an anchor block having a given coordinate or coordinates off-set by a half-pixel in a horizontal, vertical or diagonal direction. It is the inclusion of these four pipelined processors in each processing element which gives the ability to estimate motion to half-pixel accuracy.
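The role of one processing element can be sketched as follows: the current macroblock is compared against the four interpolated versions of an anchor macroblock (full-pixel A and the half-pixel offsets h, v, c) and the local comparator MINP keeps the best. This is an illustrative model, not the hardware datapath; names are assumptions.

```python
def pe_compare(current_mb, candidates):
    """Model of one processing element (figure 4): current_mb is a
    macroblock as a nested list, candidates maps the labels
    "c", "v", "h", "A" (modules Pc, Pv, Ph, PA) to interpolated
    anchor macroblocks. Returns the label and SAD of the best match,
    as selected by comparator MINP."""
    def sad(a, b):
        return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    best_label, best_sad = None, None
    for label in ("c", "v", "h", "A"):
        d = sad(current_mb, candidates[label])
        if best_sad is None or d < best_sad:
            best_label, best_sad = label, d
    return best_label, best_sad
```

It is the presence of all four candidate streams in each element that yields half-pixel precision; the single-precision element of figure 5 would keep only the "A" stream.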
- Figure 5 shows an alternative processing element 26 a.b that is suitable where only a single pixel precision is required. It is identical to the element of figure 4 except that a single parallel pipelined Module 40 is required which receives a single input PI from the input module.
- a parallel-pipelined Module 40 is shown in more detail in figure 6.
- the module comprises M blocks AD 50 operating in parallel, each of which receives as an input the output from the column current frame macroblock buffer together with an Input I.
- the Input I is provided from the Input Module and will be described in greater detail later.
- the output of each block AD 50 is passed to an adder-accumulator 60 whose output is the input to processor comparator MIN 42 in figures 4 and 5.
- the AD units each carry out a series of arithmetic operations on the incoming data.
- the units each include a Subtractor 51 which subtracts the value of the current frame macroblock data from the anchor frame macroblock data, an absolute value Unit 52 which converts the output of the Subtractor to an absolute value, an accumulating adder 54 which adds the absolute value to the sum of earlier values, a first register 56 which holds the output of the adder 54 and whose output is fed back to the second input to the adder, and a second register 58 which receives the output of the first register 56 and thus the output of the accumulator adder.
- the blocks AD calculate the sum of absolute values of M differences with each block performing pipelined operation of sequential devices.
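The arithmetic of one AD block, as described above, reduces to a subtract / absolute-value / accumulate chain over a stream of pixel pairs. The following sketch models that chain sequentially; in hardware the stages are pipelined registers, which this model does not represent.

```python
def ad_block(anchor_row, current_row):
    """Model of one AD block (figure 6): subtractor 51, absolute-value
    unit 52 and accumulating adder 54 compute the running sum of
    |anchor - current| over one stream of pixel pairs."""
    acc = 0
    for a, b in zip(anchor_row, current_row):
        diff = a - b          # subtractor 51
        acc += abs(diff)      # absolute-value unit 52 + adder 54
    return acc                # value held in registers 56/58
```

The adder-accumulator 60 then sums the M such partial results to form the input to the comparator MINP 42.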
- the adder accumulator 60 receives the output of each second register 58 of each pipeline as an input to a multiplexer 62.
- the output of the multiplexer forms the input to an accumulator-adder 64 whose output forms the input to a first register 66, whose output is fed back to adder 64 to provide its second input.
- the outputs from the blocks 58 are summed and the output fed to a second register 68 whose output is the input to the comparator MINP 42.
- the comparator MINP 42 of each processing module sequentially compares the sums provided from each of the modules Pc, Pv, Ph, PA for the current frame macroblock and, in its simplest form, defines with half-pixel precision the coordinates of the anchor frame macroblock which has the smallest partial sum. It will be understood that the macroblock with the smallest partial sum is that which corresponds most closely to the current frame block under consideration. In many applications it will be more appropriate to set a threshold for the comparison. As the threshold increases, so too does the likelihood that more than one coordinate value will reach that threshold value. In that case the MPEG-2 standard provides that the decision may be made on the basis either of the first macroblock within the threshold value or of the smallest value of all.
- If a macroblock provides no coordinate value within the threshold, as may be the case, for example, where there is a scene change, that macroblock is intra-frame coded and the remaining macroblocks are inter-frame coded. This means that the bit rate reduction process is not abandoned purely because one block cannot be matched.
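The threshold decision described above can be sketched as a small selection routine. It is a functional model only; the scan order, threshold value and return convention are assumptions for illustration.

```python
def select_match(sads, threshold, first_within=True):
    """Comparator model: sads is a list of (coordinate, sad) pairs in
    scan order. Per the two MPEG-2 options described in the text, either
    the first coordinate whose SAD falls within the threshold is chosen,
    or the smallest SAD of all those within it. Returns None when no SAD
    meets the threshold, in which case the macroblock would be
    intra-frame coded."""
    within = [(coord, d) for coord, d in sads if d <= threshold]
    if not within:
        return None                                # intra-code this block
    if first_within:
        return within[0][0]                        # first within threshold
    return min(within, key=lambda cd: cd[1])[0]    # smallest of all
```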
- pipelines AD could be implemented in a variety of other ways.
- the module comprises the anchor frame buffer 70, shown as Memory Unit I in figure 7, and M processing blocks 72, S1 to SM, together with an adder 74 and a delay line 76.
- the anchor frame buffer 70 is controlled by a control unit 78.
- the purpose of the processing blocks 72 is to provide from the input data the necessary additional data to perform calculations with half pixel precision.
- the processing blocks S 72 provide the Ic, Iv, Ih, IA data inputs to the parallel pipelined processing modules 40 of the processing elements. Again it will be understood that if the embodiment of figure 5 is adopted without half-pixel precision, the processing blocks of figure 7 are not necessary.
- Luminance data Y corresponding to these points is the input to processing modules 40 as mentioned above.
- Each of the blocks 72 comprises a delay L 80, an adder Sh 82 with delay Lh 84, an adder Sv 86 with delays Lv1 88 and Lv2 90, and an adder Sc 92.
- Adder Sh 82 performs the horizontal interpolation of equation (2), being half the sum of luminance pixels A+B in figure 2, and thus the delay 84 is of a length equal to the pixel period.
- the output of adder 82 is the luminance value at point h.
- Adder Sv performs the vertical interpolation of equation (3) being half the sum of the luminance pixels A+D in figure 2.
- Adder Sc 92 performs the central interpolation of equation 4 to calculate the luminance at point C in figure 2.
- Delays L, Lh and Lv2 all provide timing adjustment for data output on the bus PI.
- the outputs Ic, Iv, Ih and IA are comprised of lines Ic1, Ic2 ... IcM etc., with one line being provided by each of the blocks S1, S2 ... SM.
- the input module takes the anchor frame data and forms the A,h,v and c data for each of M inputs.
- the A value is a simple delayed version of the input whereas h,v and c are obtained by performing equations (2), (3) and (4) as described in relation to figure 2.
- the additional adder Sv 74 and delay Lv 76 shown in figure 7 are required as the value h relative to the last pixel A to be calculated requires knowledge of the next pixel B. This is provided by output M+1 from the buffer 70.
- Figure 8 shows the input buffer 70 of the input module in more detail.
- Data inputs Ih, Iv are provided to first and second data registers 100, 102. Data from these registers is transferred to a multiplexer 104 according to an anchor frame data priming algorithm which will be described.
- the multiplexer outputs data to a plurality of M+1 two-part memory blocks I1 to IM+1 106 which store M+1 columns of anchor frame data.
- the output of the multiplexer and the memory blocks 106 are both controlled by signals AR, AWT, AWF from the Control Unit 78 (figure 7) .
- Data is output from the memory blocks to a switch matrix MXI.1-MXI.M+1 108 having M+1 inputs and M+1 outputs.
- the output of the switch matrix is the M+1 lines to the M processing blocks S of figure 7.
- the control unit 78 in figure 7 operates according to the anchor frame data priming algorithm and generates the anchor frame macroblock coordinates which are sent to the processing elements 26 for processing.
- the current frame macroblock buffer 30 comprises M memory blocks with N cells.
- the organisation of the buffer 30 enables simultaneous storage of current frame macroblocks and the reading and loading of the next macroblock of the current frame.
- the memory blocks and registers 32 receive data serially.
- the organisation of the current frame input buffer is illustrated in figures 9 and 10.
- each of the B buffers comprises a series of memory blocks 1 to M each having N cells which are duplicated and which blocks have outputs to a respective one of M multiplexers whose outputs are passed to the processing elements of a given column.
- the comparator modules 34, MIN1-MINS of figure 3, sequentially compare the partial sums from elements PE1.i to PEk.i and define the coordinates of the anchor frame macroblock for which the threshold criteria are achieved. These coordinates are passed to the output block 36 for output.
- Data loading algorithm
- Figures 11 to 15 show the steps in the anchor frame priming process to generate the macroblock coordinates.
- Figure 11 is an overview of the process and figures 12, 13 and 14 show, respectively, the READ F, WRITE T and WRITE F steps in more detail.
- Figure 15 is a schematic representation of an anchor frame.
- the first stripe 202a with upper left corner coordinates (1,1) will be loaded and processed in module Input I1.
- the second stripe 202b with coordinates (1,68) will be loaded in module Input I2
- the third stripe 202c with coordinates (1,136) will be loaded in module Input I3
- the fourth stripe 202d with coordinates (1,204) will be loaded in module Input I4.
- the stripes are loaded in sequence. All stripes are processed in parallel and in the same manner.
- Field F 204 is the part of a stripe that represents a number matrix with dimensions (M+1)xd.
- Column T 206 is the part of a stripe that represents a number matrix with dimensions 1xd.
- Each of memory modules I1, I2, ..., IM+1 (figure 8) comprises two banks, each having a volume d; one is used for processing (the current operational bank) and the other is used for loading the next portion of data.
- Field F is loaded in the bank that currently is used for loading.
- Each column T of the field F is loaded in the corresponding memory module. This operation is denoted Write F - field load and is shown in figure 14.
- the algorithm for the Write F operation provides sequential loading of columns T of field F in corresponding memory modules. In each memory module, column T is loaded sequentially according to the address AWF value.
- the data in this bank is ready for processing.
- the field F of the next anchor frame will be loaded further in the second memory bank.
- In the operational memory bank two operations are performed: the field F read operation, denoted Read F, and the column loading operation, denoted Write T. These two operations are illustrated in figures 12 and 13 respectively.
- the Read F operation represents the sequence of M+1 simultaneous operand read operations from M+1 memory modules according to the common address AR.
- the initial address AR is equal to zero. After N read operations the initial address increments by one and the next N read operations are performed, and so on until the initial address becomes greater than d-N.
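The Read F address sequence just described can be sketched as a simple generator: from each initial address, N sequential reads are issued before the initial address advances by one. This is a behavioural model of the address pattern only.

```python
def read_f_addresses(d, N):
    """Read F address sequence: starting from initial address 0,
    N sequential reads are issued, then the initial address advances
    by one, until the initial address would exceed d - N."""
    addresses = []
    start = 0
    while start <= d - N:
        addresses.extend(range(start, start + N))
        start += 1
    return addresses
```

Each address in the list is presented simultaneously to all M+1 memory modules.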
- the data of the leftmost column can then be replaced with the data of the next column to the right of the field F.
- the Write T algorithm for the column T loading operates as follows. Firstly the coordinate Y is incremented by one and the new value is compared with the C value (frame vertical dimension). If Y ≤ C, the column loading operation continues. Write address AWT is calculated and then the AWT value is compared with the value of the current read address AR (from the Read F operation). This comparison is necessary because read and write operations are performed from and to the same memory bank, and the Read F operation should provide the correct column T data reading. If AWT < AR and there is a ready signal from register Rin1, the data is loaded to the address AWT, and so on until j ≤ d.
- the whole loading algorithm for the loading of one stripe of the anchor frame is shown in figure 11.
- Firstly field F is loaded in memory through the Write F operation. This operation is synchronized by a ready signal from register Rin2.
- the finish of this operation is synchronized by the end of loading of S current frame macroblocks in module Input B.
- three parallel processes are being performed in the operational bank: Write F; Read F; and Write T.
- the last two processes are synchronized by the read address AR. The finish of these processes is also synchronized.
- the embodiment described provides parallel processing of calculations, anchor and current frame data input and motion vector output through a matrix of processing elements and input modules for the anchor frame and current frame data and an output module for the motion vectors .
- Motion vectors are calculated in parallel for a set of current frame macroblocks and, preferably, to a half pixel precision.
- M sums of absolute differences are calculated in parallel in the processing elements and a single macroblock row of 16 pixels is processed in parallel.
- Pipeline processing is provided for in the calculation of the sum of absolute values of differences, the summing of those sums and the comparison of those sums to determine the closest anchor frame macroblock.
- the invention has been described with reference to forward predicted coding. It will be appreciated that it is equally applicable to bidirectional coding. The latter is achieved by performing the comparison operation for the current frame twice, once with the forward anchor frame and once with the backward anchor frame, and then comparing the results of the two. The better of the two is then taken as the predicted frame.
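The bidirectional decision described above reduces to running the comparison once per anchor frame and keeping the better result. A minimal sketch, with assumed names and (sad, motion_vector) result pairs:

```python
def choose_prediction(forward, backward):
    """Bidirectional decision: forward and backward are (sad, mv) pairs
    obtained by running the full comparison once against the forward
    anchor frame and once against the backward anchor frame; the
    prediction with the smaller SAD is taken."""
    if backward[0] < forward[0]:
        return "backward", backward[1]
    return "forward", forward[1]
```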
- the motion estimator can operate on a whole frame of current macroblocks or, where the number of blocks is too high, can process the frame in a number of passes.
- An alternative would be to use two or more processors; however, there is adequate time for at least two passes.
- the motion estimator described herein may be used in any environment in which MPEG 2 coding is required. This includes, for example, video signal encoding for broadcast or broadcast quality pictures for subsequent narrowcast or recordal, multipoint tele- or video conferencing equipment, DVD video encoders, video cameras including broadcast quality cameras and camcorders .
- For applications such as multipoint tele- or video conferencing, it is not practical for the search to be based on a full anchor frame and it is appropriate to define a search window. As the amount of movement is likely to be small, it is believed that this approach is satisfactory and can give very significant improvements over presently available systems, enabling rates of up to 15 frames per second on conventional ISDN links with a data rate of 128 kbit/s. In other applications the statistical approach of the whole frame search is more appropriate. It will be understood that the estimator as described affords the possibility of either solution, depending on the application.
- FIGs 16 to 20 show examples of how the embodiment of the invention described can be used in a variety of different applications, each using MPEG based video compression.
- an MPEG processor 248 is the core part of all the applications.
- the MPEG processor comprises a programmable DSP engine 250 to support the basic functions of MPEG video coding and compression and decoding and decompression including DCT, IDCT, Q, Q⁻¹, VL coding and so on.
- the Motion Detection Processor 252 is a parallel-pipelined processor embodying the present invention.
- the complexity of the MDP engine 252 will depend on the demands of the real-time video sequences being processed for a particular MPEG level and profile. Computational performance of the DSP engine 250 should also be consistent with the particular application.
- the MPEG processor proposed can be implemented using existing DSP processors, for example the TMS320C62 DSP processor. Thus it is necessary only to develop the MDP. This two-chip solution can be used for the lower MPEG profiles and levels. For higher MPEG levels and profiles it may be necessary to develop a more powerful DSP engine. It is possible to develop a single-chip solution for the MPEG processor due to its general structure as outlined above. In the case of a single-chip solution, the processor will have one input data bus and a single interface to the external RAM.
- Figure 17 illustrates how an embodiment of the present invention may be used in a video conference system.
- videoconferencing systems are being developed mainly on a PC platform.
- the embodiment of figure 17 frees the Pentium (or other) PC processor from the hard computational task of determining motion vectors.
- the system controller 260 communicates with a PCI bus through a PCI interface 262, and with an MPEG processor 264 as illustrated in figure 16 and embodying the present invention over the system bus.
- the MPEG processor is coupled to a RAM 266 with which it can exchange data.
- the front end devices may either be attached to the system bus (solid line in figure 17) , or connected through the system controller 260 (shown as dotted lines in figure 17) .
- the MPEG processor encodes digital video and audio data from the front end devices.
- the MPEG data stream is output through the system controller and the PCI bus and can be further transported to the destination through the communications capabilities of the PC.
- the MPEG processor 264 also decodes incoming audio and video data which is received as an MPEG data stream on the PCI bus. Decompressed audio and video data is further available to the user through the PCI bus and the corresponding PC capabilities such as the monitor and sound blaster.
- the system outlined above is suitable for a number of videoconferencing systems such as point-to -point QCIF videoconferencing, multipoint QCIF videoconferencing and low-bit CIF videoconferencing on ISDN lines.
- the processing of audio data is optional and may be performed using PC software or by the DSP engine.
- A DVD system embodying the present invention has the same architecture as shown in figure 17. Differences may exist in the MPEG processor due to the need for compression conforming to the CCIR Rec. 601 standard. To provide the corresponding MPEG level and profile, a more complex MDP engine is required. As the system is intended only to compress video and audio and to write an MPEG stream on DVD ROM through the PC's capabilities, the increase in DSP complexity, if any, may be negligible.
- Figure 18 shows how an MPEG processor embodying the present invention and as shown in figure 16 may be used in a videophone system.
- the system is based on a QCIF videoconferencing system and is similar to the system illustrated in figure 17 except that it requires audio and video back end devices 272, 274 which provide digital to analog conversion of decompressed MPEG data.
- the system controller interface must include a modem interface 275 for exchange of digital MPEG data between the transmitting and receiving points. In this system, audio data processing is necessary.
- Figure 19 shows how an MPEG processor embodying the present invention and illustrated in figure 16 may be used, in conjunction with DVD technology for MPEG data storage, to develop a digital video camera. This realisation relies on the availability of rewritable DVD-ROMs with sufficiently good speed characteristics.
- the arrangement is similar to that of figure 18 except that the audio and video back end devices are needed only if playback is required, that a DVD controller 278 communicates with the system controller, and that no modem is needed.
- Figure 20 shows an example of how the MPEG encoder embodying the invention and illustrated in figure 16 may be used as a television MPEG encoder.
- the circuit illustrated may be used in broadcasting equipment to encode a single television channel.
- the same configuration may be used for standard definition and HDTV, the difference being in the complexity of the MPEG processors.
- Present fabrication techniques allow a processor for standard definition to be built on a single chip. At present several chips operating in parallel are required to support HDTV, although it is envisaged that a single-chip solution will become possible shortly as fabrication techniques improve.
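The gap between the single-chip and multi-chip cases can be made concrete by comparing pixel throughput. The sketch below assumes CCIR Rec. 601 standard definition (720x576 at 25 frames/s) and a representative HDTV format (1920x1080 at 30 frames/s); these figures are illustrative and not taken from the patent.

```python
# Rough pixel-rate comparison between standard definition and HDTV,
# indicating the scale of the extra processing an HDTV MPEG processor
# must sustain. Assumed formats: Rec. 601 SD and 1080-line HDTV.

sd_rate = 720 * 576 * 25      # pixels per second, SD (Rec. 601, 25 fps)
hd_rate = 1920 * 1080 * 30    # pixels per second, HDTV (1080p30)

print(f"SD:   {sd_rate / 1e6:.1f} Mpixel/s")
print(f"HDTV: {hd_rate / 1e6:.1f} Mpixel/s ({hd_rate / sd_rate:.1f}x SD)")
```

The roughly sixfold increase in pixel rate understates the motion-estimation workload, since larger frames also imply larger search ranges, which is consistent with several parallel chips being needed for HDTV today.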
- the architecture parameters depend on the values of the following primary data:
- A - frame horizontal dimension;
- C - frame vertical dimension;
- p - number of bits for pixel representation;
- MxN - macroblock dimensions;
- Tc - time interval for a single operation on a pixel in the pipeline, and memory read time interval;
- Tio - time interval for the input/output of a single external information bit;
- T - time interval for the calculation of the motion vectors for the full current frame;
- Lmax - maximal number of input/output ports.
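To make the primary data concrete, the sketch below assigns illustrative values for a QCIF frame with the standard MPEG 16x16 macroblock. The frame dimensions and macroblock size are standard; the timing values Tc, Tio and the frame rate are placeholder assumptions, not values from the patent.

```python
# Illustrative values of the primary data listed above, for a QCIF frame
# with 16x16 MPEG macroblocks. Tc and Tio are assumed placeholder timings.

A, C = 176, 144    # frame horizontal / vertical dimensions (QCIF)
p = 8              # bits for pixel representation (luminance)
M = N = 16         # macroblock dimensions
Tc = 10e-9         # assumed pipeline operation / memory read time, seconds
Tio = 10e-9        # assumed per-bit I/O time, seconds
T = 1 / 30         # assumed frame period at 30 frames/s

macroblocks = (A // M) * (C // N)   # macroblocks per frame
frame_bits = A * C * p              # bits per frame

print(f"{macroblocks} macroblocks per frame, {frame_bits} bits per frame")
```

These quantities are the inputs from which the derived parameters LB, LV, D and K in the expressions below are obtained.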
- the value of LB is calculated from the following expression:
- the value of LV is calculated from the following expression:
- D is the length of the column that is loaded into the K Input modules.
- D is calculated from the following expression: D ≤ N/(1 - (p*Tio/(N*Tc)) * (K/LIh))
- K = (A*C*(A-M)*(C-N)/M) * (Tc/T)/S (21).
- Table 2 below presents the results of applying the optimization procedures for various video formats.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP99950926A EP1125441A1 (en) | 1998-10-19 | 1999-10-18 | Parallel processor for motion estimator |
AU63517/99A AU6351799A (en) | 1998-10-19 | 1999-10-18 | Parallel processor for motion estimator |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB9822799.4 | 1998-10-19 | ||
GB9822799A GB2343806A (en) | 1998-10-19 | 1998-10-19 | Parallel processor for motion estimator |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2000024203A1 true WO2000024203A1 (en) | 2000-04-27 |
Family
ID=10840842
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/GB1999/003438 WO2000024203A1 (en) | 1998-10-19 | 1999-10-18 | Parallel processor for motion estimator |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1125441A1 (en) |
AU (1) | AU6351799A (en) |
GB (1) | GB2343806A (en) |
WO (1) | WO2000024203A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022056817A1 (en) * | 2020-09-18 | 2022-03-24 | Qualcomm Incorporated | Anchor frame selection for blending frames in image processing |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5512962A (en) * | 1993-05-25 | 1996-04-30 | Nec Corporation | Motion vector detecting apparatus for moving picture |
EP0723366A2 (en) * | 1995-01-17 | 1996-07-24 | Graphics Communications Laboratories | Motion estimation method and apparatus for calculating a motion vector |
US5594813A (en) * | 1992-02-19 | 1997-01-14 | Integrated Information Technology, Inc. | Programmable architecture and methods for motion estimation |
US5659364A (en) * | 1993-12-24 | 1997-08-19 | Matsushita Electric Industrial Co., Ltd. | Motion vector detection circuit |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3089165B2 (en) * | 1994-11-10 | 2000-09-18 | 株式会社グラフィックス・コミュニケーション・ラボラトリーズ | Motion vector search device |
-
1998
- 1998-10-19 GB GB9822799A patent/GB2343806A/en not_active Withdrawn
-
1999
- 1999-10-18 WO PCT/GB1999/003438 patent/WO2000024203A1/en not_active Application Discontinuation
- 1999-10-18 AU AU63517/99A patent/AU6351799A/en not_active Abandoned
- 1999-10-18 EP EP99950926A patent/EP1125441A1/en not_active Withdrawn
Non-Patent Citations (4)
Title |
---|
ACKLAND B D: "A VIDEO-CODEC CHIP SET FOR MULTIMEDIA APPLICATIONS", AT & T TECHNICAL JOURNAL,US,AMERICAN TELEPHONE AND TELEGRAPH CO. NEW YORK, vol. 72, no. 1, 1 January 1993 (1993-01-01), pages 50 - 66, XP000367735, ISSN: 8756-2324 * |
DE GREEF E ET AL: "Mapping real-time motion estimation type algorithms to memory efficient, programmable multi-processor architectures", MICROPROCESSING AND MICROPROGRAMMING,NL,ELSEVIER SCIENCE PUBLISHERS, BV., AMSTERDAM, vol. 41, no. 5, 1 October 1995 (1995-10-01), pages 409 - 423, XP004002606, ISSN: 0165-6074 * |
KOMAREK T ET AL: "ARRAY ARCHITECTURES FOR BLOCK MATCHING ALGORITHMS", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS,US,IEEE INC. NEW YORK, vol. 36, no. 10, 1 October 1989 (1989-10-01), pages 1301 - 1308, XP000085317 * |
SCHIEFER P: "PICTURE PROCESSING RAMS (PPRAMS) FOR MOTION ESTIMATION", IEEE TRANSACTIONS ON CONSUMER ELECTRONICS,US,IEEE INC. NEW YORK, vol. 38, no. 3, 1 August 1992 (1992-08-01), pages 570 - 575, XP000311895, ISSN: 0098-3063 * |
Also Published As
Publication number | Publication date |
---|---|
AU6351799A (en) | 2000-05-08 |
GB9822799D0 (en) | 1998-12-16 |
GB2343806A (en) | 2000-05-17 |
EP1125441A1 (en) | 2001-08-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chen et al. | A new block-matching criterion for motion estimation and its implementation | |
US5347309A (en) | Image coding method and apparatus | |
JP4001400B2 (en) | Motion vector detection method and motion vector detection device | |
US20050013366A1 (en) | Multi-standard variable block size motion estimation processor | |
EP0720386A2 (en) | Temporally-pipelined predictive encoder/decoder circuit and method | |
EP1653744A1 (en) | Non-integer pixel sharing for video encoding | |
Akiyama et al. | MPEG2 video codec using image compression DSP | |
JPH1169345A (en) | Inter-frame predictive dynamic image encoding device and decoding device, inter-frame predictive dynamic image encoding method and decoding method | |
JPH09247679A (en) | Video encoder in compliance with scalable mpeg2 | |
JPH10150666A (en) | Method for compressing digital video data stream, and search processor | |
US20080123748A1 (en) | Compression circuitry for generating an encoded bitstream from a plurality of video frames | |
EP0577310B1 (en) | Image processing device | |
US8451897B2 (en) | Highly parallel pipelined hardware architecture for integer and sub-pixel motion estimation | |
EP1389875A2 (en) | Method for motion estimation adaptive to DCT block content | |
US20050036550A1 (en) | Encoding and transmitting video information streams with optimal utilization of a constrained bit-rate channel | |
JPH089375A (en) | Inverse discrete cosine transformation anticoincidence controller and picture encoding device | |
JPH07274181A (en) | Video signal encoding system | |
KR20020067192A (en) | Video decoder having frame rate conversion and decoding method | |
EP1125441A1 (en) | Parallel processor for motion estimator | |
US20080282304A1 (en) | Module and architecture for generating real-time, multiple-resolution video streams and the architecture thereof | |
Campos et al. | Integer-pixel motion estimation H. 264/AVC accelerator architecture with optimal memory management | |
KR20000018311A (en) | Method for presume motion of image system and apparatus | |
Goh et al. | Real time full-duplex H. 263 video codec system | |
KR920010514B1 (en) | Digital signal processing apparatus | |
Hayashi et al. | A bidirectional motion compensation LSI with a compact motion estimator |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref country code: AU Ref document number: 1999 63517 Kind code of ref document: A Format of ref document f/p: F |
|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
WWE | Wipo information: entry into national phase |
Ref document number: 1999950926 Country of ref document: EP |
|
WWP | Wipo information: published in national office |
Ref document number: 1999950926 Country of ref document: EP |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 09830049 Country of ref document: US |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 1999950926 Country of ref document: EP |