EP1964053A2 - Method for processing an object on a platform having one or more processors and memories, and platform using the method - Google Patents
Method for processing an object on a platform having one or more processors and memories, and platform using the method
- Publication number
- EP1964053A2
- Authority
- EP
- European Patent Office
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
Definitions
- The present invention relates to a method for processing, in a platform with one or more processors and memories, an object consisting of elementary information. It also relates to a platform using such a method.
- an elementary information element is an information element to be processed, represented by one or more numerical values.
- This information can be encoded according to various types of coding such as 8-bit coding, 10-bit coding or signed 16-bit coding.
- When the object is an image, the elementary information is the pixels of that image.
- The processing applied to the object in the platform corresponds to an algorithm that can belong to various domains such as, for example, image processing, data compression or decompression, sound processing, signal modulation and demodulation, measurement, data analysis, indexing or searching a database, computer vision, graphic processing, simulation, or any domain that uses a large amount of data.
- Known programmable processors, for example scalar processors, signal processing processors or vector processors, in particular of the SIMD ("Single Instruction Multiple Data") type, make it possible to apply an algorithm to an object composed of elementary information and decomposed into blocks, or sub-objects, that is, into groups of elementary information.
- Each operation is usually applied to a complete block, and the next operation is then carried out on a block that has been reduced in size. Indeed, some of these operations reduce the size of the blocks, resulting in edge effects when the following operations are applied. Consequently, applying an algorithm with a known processor requires a large number of memory accesses, since an operation is applied to all the blocks successively before moving on to the next operation. This leads to frequent writing to and reading from memory. Large blocks must also be used to reduce edge effects, which requires a relatively large memory to store them. In addition, a large number of loops means that the loop initialization and loop termination code is present a large number of times, which leads to a large code size.
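The block shrinkage described above can be illustrated with a little arithmetic. The following is a minimal sketch, assuming each operation is a one-dimensional filter of a given number of taps applied without any data from neighbouring blocks; the function names and the filter widths are hypothetical and do not come from the patent.

```python
def shrunk_size(block_size, filter_widths):
    """Side length of a block after applying each filter without neighbour data.

    Assumption: a filter with `width` taps loses `width - 1` samples in total
    along the filtered dimension (edge effect).
    """
    size = block_size
    for width in filter_widths:
        size -= width - 1
    return max(size, 0)


def useful_fraction(block_size, filter_widths, dims=2):
    """Fraction of the original block that is still valid after all filters."""
    return (shrunk_size(block_size, filter_widths) / block_size) ** dims


# A 32-wide block passed through 3-tap, 3-tap and 5-tap filters.
remaining = shrunk_size(32, [3, 3, 5])
fraction = useful_fraction(32, [3, 3, 5])
```

Under these assumptions only part of each block carries valid results, which is why larger blocks (and therefore more memory) are needed to keep the overhead acceptable.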
- Operations such as mapping tables or the application of a local displacement are not suitable for implementation on a vector processor with a shift or swap communication scheme.
- Each platform has its own characteristics, both in terms of hardware (for example number and type of processors, or size and type of memory) and in terms of the language used (C for a scalar processor and assembler for a vector processor).
- The object of the invention is to provide a method for processing an object in a platform that overcomes at least one of the disadvantages mentioned above.
- The invention makes it possible to optimize the processing in terms of code size, calculation time, memory access rate and memory size. This optimization makes it possible to reduce the calculation time of an algorithm and the power consumption for a given computing power and memory size, and therefore a given silicon area.
- The invention relates to a method for processing, in a platform with one or more processors and memories, an object constituted by elementary information of the same nature. The method comprises the step of decomposing the object to be processed into at least two sub-objects consisting of N elementary information each, all the sub-objects having the same number N of elementary information, the processing consisting of performing at least one sequence of specific operations on the elementary information of each sub-object. The method further comprises the step of performing, for each sub-object, each specific operation at least N times, so as, on the one hand, to involve each elementary information of each sub-object at least once and, on the other hand, to produce N results for each specific operation. The sequence of specific operations is such that at least one specific operation k of the sequence produces, at least once during its N applications, a result used for the processing of another sub-object.
- the specific sequence of operations does not include a loop.
- the platform comprises Q processors.
- The processing of each sub-object is distributed over the Q processors, each of which performs at least one specific operation of the specific sequence of operations, so that all the processors are used for each sub-object and the same processors are used for all the sub-objects. Thus, it is not necessary to assign the sub-objects to the processors.
- the same specific operation is also performed by the same processor for the processing of all other sub-objects.
- The processing is thus regular: the specific operations are assigned to the processors and then carried out periodically for the processing of each sub-object.
- all the loops required for processing depend on the topology of the object and the platform, but are independent of the sequence of specific operations.
- The loops are nested within each other around the complete specific sequence of operations.
- the loops encapsulate the whole of the specific sequence of operations and the specific sequence of operations is not split into subsequences each surrounded by loops.
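The difference between loops placed around each operation and loops placed around the whole sequence can be sketched as follows. This is an illustrative example only: the two operations, the one-dimensional object and all function names are assumptions chosen for brevity, not the patent's own code.

```python
def op_scale(x):
    return 2 * x


def op_offset(x):
    return x + 1


def per_operation_loops(obj):
    """Prior-art style: one full pass (one loop) per operation."""
    tmp = [op_scale(x) for x in obj]        # first loop, intermediate stored in memory
    return [op_offset(x) for x in tmp]      # second loop


def sequence(x):
    """Specific sequence of operations: straight-line code, no loop inside."""
    a = op_scale(x)
    return op_offset(a)


def loops_around_sequence(obj, n):
    """Loops depend only on the object and sub-object layout, not on the sequence."""
    if len(obj) % n:
        raise ValueError("object length must be a multiple of n")
    out = []
    for start in range(0, len(obj), n):         # loop over sub-objects
        for i in range(start, start + n):       # inner loop over the N elements
            out.append(sequence(obj[i]))
    return out
```

Both functions give the same results, but in the second form the loop code appears once, whatever the number of operations in the sequence.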
- the sub-objects are composed of contiguous elementary information.
- The processors are chained, the chaining comprising at least one queue.
- a queue makes it possible to transmit and / or store elementary information or results of specific operations.
- a queue may include or use a memory.
- a queue can be implemented using one or more FIFO processes ("First in first out").
- a queue has at least one entry and at least one exit.
- a queue may be functionally connected by any means to an input computing unit and an output computing unit.
- a queue can also be functionally connected by any means to PR input calculation units and PR output calculation units; in this case the queue behaves like PR queues, each connecting an input calculation unit to an output calculation unit.
- a queue makes it possible to independently manage several streams of data, each stream being associated with a given specific instruction.
- a queue uses at least one unit of memory making it possible to store, for each stream, an identical number NF of data.
- NF is determined according to the relative arrangement of the sub-objects and the traversal order, so that NF-1 sub-objects are processed between the processing of a sub-object producing a datum and the processing of the sub-object using that datum.
- A chaining comprising calculation units and a queue comprises a mechanism for managing start-up: the queue is initialized regularly, for example at the beginning of each line if the queue is part of a horizontal chaining and the object is an image; as long as the queue does not contain NF data, the processor that follows the queue in the chaining takes as input the data that it sends as output; afterwards, the processor that follows the queue in the chaining takes the oldest data in the queue as input and removes it from the queue.
- the queue makes it possible to output the data in the same order as they were entered in the queue.
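The queue behaviour described above (fixed depth NF, first-in first-out order, a start-up phase, periodic re-initialization) can be sketched as below. This is a hedged illustration: the class name `DelayQueue`, the `fallback` argument used during start-up, and the helper `nf_between_rows` are assumptions introduced here. In particular, the rule that a vertical link in a row-by-row traversal needs NF equal to the number of sub-objects per row is an inference from the text, not a statement from the patent.

```python
from collections import deque


class DelayQueue:
    """FIFO holding NF data: a datum written at step i is read at step i + NF."""

    def __init__(self, nf):
        self.nf = nf
        self.items = deque()

    def reset(self):
        """Re-initialize the queue, e.g. at the beginning of each image line."""
        self.items.clear()

    def exchange(self, produced, fallback):
        """Store `produced`; return the datum the following processor consumes.

        Start-up phase (assumption): while fewer than NF data are stored,
        the consumer receives the caller-supplied `fallback` value.
        """
        if len(self.items) < self.nf:
            consumed = fallback
        else:
            consumed = self.items.popleft()   # oldest datum leaves the queue
        self.items.append(produced)
        return consumed


def nf_between_rows(image_width, sub_width):
    """Assumed NF for a vertical link when sub-objects are traversed row by row."""
    if image_width % sub_width:
        raise ValueError("image width must be a multiple of the sub-object width")
    return image_width // sub_width
```

With NF = 3, a datum produced while processing sub-object 0 is consumed while processing sub-object 3, so two (NF-1) sub-objects are processed in between.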
- the circular chaining is unidirectional.
- the circular chaining is such that there is a single input link and a single output link per calculation unit.
- a queue is, for example, implemented using a microprocessor.
- each of the specific operations of the sequence is performed N times in total and N / Q times by each of the Q processors.
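As a small worked example of the N / Q distribution, the sketch below assigns the N applications of one specific operation to Q processors. The round-robin assignment and the function name `schedule` are assumptions made for illustration; the text only requires that each processor perform N / Q applications.

```python
def schedule(n, q):
    """Return, for each of the q processors, the element indices it handles.

    Assumption: round-robin assignment, with n a multiple of q.
    """
    if n % q:
        raise ValueError("n must be a multiple of q")
    return [[i for i in range(n) if i % q == p] for p in range(q)]


# N = 8 elementary information, Q = 4 processors: 2 applications per processor.
assignment = schedule(8, 4)
```

Because the assignment depends only on N and Q, the same processor handles the same positions in every sub-object.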
- each processor performs the part of the sequence taking into account these conditional branches.
- The sub-objects have no overlap in at least one dimension, so that at least one specific operation result produced when processing a sub-object is used when processing another sub-object.
- The sub-objects have no overlap in any dimension, so that each processor can be used at 100% without repeating any calculation.
- The sequence of specific operations is such that at least one specific operation k of the sequence produces, at least once during its N applications, a result used for the processing of another sub-object.
- The queue is shared between all the circular chainings of the same dimension.
- There is exactly one queue per dimension of the sub-object, and each queue is shared between all the circular chainings of the same dimension. The communication between the processors and the organization of the memory are therefore particularly simple.
- the invention also makes it possible to obtain, for a given algorithm, a performance proportional to the number of processors, without changing the algorithm or the memory size, and while using small processors.
- The invention makes it possible to make available, on a component, a computing power of several tens of billions of operations per second per mm², for a component etched with a 0.09 µm process. Such densities usually require an optimized hardwired architecture, which takes a long time to develop and offers no flexibility to change the algorithm. By contrast, the invention makes it possible to program any algorithm with great ease and therefore in a very short time.
- The invention makes it possible to regularize a processing consisting of operations with edge effects.
- The objects processed by a method such as that described above are preferably raw images (of the "raw" type) before the demosaicing operation, in which case:
- either the elementary information is a pixel represented by one numerical value which, depending on the absolute position of the pixel, corresponds for example to red, green or blue;
- or an elementary information is a group of pixels (for example a 2 * 2 group of green, red, blue and green pixels corresponding to a "Bayer" pattern), represented by one numerical value per pixel.
- the objects can also be visible images, in which case the elementary information is a pixel represented, for example, by three numerical values, each of the values representing a color, for example red, green and blue.
- the objects can also be sequences of images, in particular raw or visible, in which case the elementary information is a pixel of an image of the image sequence.
- the objects therefore correspond, for example, to videos.
- the image may be from an image capture apparatus and / or intended for an image rendering apparatus:
- An image capture apparatus is, for example, a disposable camera, a digital camera, an SLR camera (digital or not), a scanner, a fax machine, an endoscope, a camcorder, a surveillance camera, a toy, a camera integrated in or connected to a telephone, a personal assistant or a computer, a thermal camera, an ultrasound machine, an MRI (magnetic resonance imaging) device, or an X-ray machine.
- An image rendering apparatus is, for example, a screen, a projector, a television set, virtual reality glasses, or a printer.
- A device for both capturing and rendering images is, for example, a scanner / fax / printer, a mini lab for printing photos, or a video conference device.
- the processing platform can take various forms depending on the application.
- the object is an image
- An image capture apparatus that produces processed images, for example a digital camera incorporating a processing platform.
- An image rendering apparatus which displays or prints processed images, for example a video projector or a printer including a processing platform.
- a mixed device that corrects the defects of its elements, for example a scanner / printer / fax including a processing platform.
- a professional image capture device that produces processed images, for example an endoscope including a processing platform.
- The processing platform may be offloaded in whole or in part to a server.
- An algorithm, or an object processing, corresponds for example, without the list being limiting and in the case where the object is an image, to:
- a calculation, in particular of statistics, for a white balance, and / or
- a processing, in particular for improving color rendition, and / or
- a processing, in particular for improving the rendering of contrast, and / or
- a processing, in particular for improving the rendering of details, and / or
- noise reduction, and / or
- a measurement, and / or
- compression.
- the processing applied to the object may consist of a sequence of operations, also called specific operations.
- The results of specific operations are also called elementary information and may or may not be of the same type as the elementary information of the object.
- Sub-objects are sets of elementary information whose shape and size depend, according to the case, on the characteristics of the platform, including the size and type of memory and, in the case of a vector processor, the size of a vector, but also on the characteristics of the object to be processed.
- the objects and sub-objects as well as the logical blocks have several dimensions.
- the dimensions of the subobjects and logical blocks correspond to all or part of the dimensions of the object.
- the dimensions can be of various natures, in particular:
- spatial, for example a distance, an angle or a traversal of a mesh,
- frequency, for example color, frequency or frequency band,
- a decomposition according to a vector space basis, for example a decomposition into wavelets, or a decomposition into high-order and low-order bits,
- in general, the dimensions of a space of any topology.
- a raw fixed image with 2 dimensions, each corresponding to distances, the pixels being each provided with a color, for example red, green or blue,
- a medical image with dimensions of distance and, possibly, channel dimensions,
- a hologram with dimensions of angle of view,
- a modulated signal with one or more dimensions, corresponding to time and, optionally, a frequency and, possibly, a position in space or an angle,
- an object with one or more dimensions.
- the elementary information of an object can have a position and / or an absolute scale notably spatial and / or temporal and / or frequency and / or in any other dimension of the object:
- an elementary information of a "sound" object can correspond to an intensity; in this case, the elementary information has an absolute position corresponding to a given instant and, in the case of multichannel sound, to a given channel,
- an elementary information of an "image" object can correspond to a pixel; in this case, the elementary information has an absolute position corresponding to a position in the image and, in the case of a video image, to a given instant,
- an elementary information of a "simulation data" object can correspond to a state; in this case, the elementary information has an absolute position corresponding to a mesh node and to a given instant,
- an elementary information of a "modulated signal" object may correspond to an intensity and / or a phase; in this case, an elementary information has an absolute position corresponding to a given instant and possibly a given frequency and possibly a given position if several antennas or transmitters are used.
- the relative positions and the absolute or relative scales, according to at least one dimension, in particular spatial and / or temporal, can correspond to various concepts according to the nature of the object. They apply between any 2 blocks, whatever their type (in the case of an image as described above, a logic block can notably be raw, red, green, 8 bits, etc.).
- the absolute or relative position may correspond, in one embodiment, to 2 values (vertical and horizontal) and the absolute or relative scale to 2 values (vertical and horizontal).
- the pixels of the top line of an object can have as absolute positions (0; 0) (0; 1) (0; 2) ..., and the pixels of the n-th line can have as absolute positions (n; 0) (n; 1) (n; 2).
- the relative positions can be coded in the following way: (-1; 0) indicates the pixel above, (0; 0) corresponds to a zero displacement, (0; 1) indicates the pixel to the right and (2; -2) indicates the pixel 2 pixels below and 2 to the left; a relative scale of (0.5; 0.5) corresponds to half the resolution in each direction.
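The (vertical; horizontal) coding of relative positions and scales can be checked with a few lines of code. This is a minimal sketch; the function names `apply_relative` and `scaled_size` and the sample coordinates are hypothetical.

```python
def apply_relative(position, displacement):
    """Add a (vertical; horizontal) relative position to an absolute position."""
    row, col = position
    d_row, d_col = displacement
    return (row + d_row, col + d_col)


def scaled_size(shape, scale):
    """Size of an object after applying a (vertical; horizontal) relative scale."""
    return tuple(int(side * factor) for side, factor in zip(shape, scale))


above = apply_relative((5, 5), (-1, 0))        # pixel above
right = apply_relative((5, 5), (0, 1))         # pixel to the right
far = apply_relative((5, 5), (2, -2))          # 2 pixels below, 2 to the left
half = scaled_size((100, 200), (0.5, 0.5))     # half resolution in each direction
```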
- the absolute or relative position may correspond to 2 values (vertical and horizontal) and the scale absolute or relative to 2 values (vertical and horizontal);
- the pixels of the top line of an object can have as absolute positions (0; 0) (0; 1) (0; 2) ...
- the pixels of the n-th line can have as absolute positions (n; 0.5) (n; 1.5) (n; 2.5) if the line is odd, and (n; 0) (n; 1) (n; 2) if the line is even.
- the relative position can correspond to 2 values (vertical and horizontal), for example (-0.5; 0.5) indicates the upper right, (0; 1) indicates the right and (-0.5; ) indicates the pixel to the right of the pixel at the upper right.
- a relative scale of (0.5; 0.5) corresponds to a resolution of half in each direction.
- a combination of relative displacement and relative scale can be coded using 2 functions f and g as follows: (f(x; y); g(x; y)) for each pixel of absolute position x, y. It should be noted that a rounding rule is necessary in order to take, for example, the nearest pixel.
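A combination of displacement and scale expressed as one function per dimension, followed by a rounding rule, can be sketched as follows. The particular functions `f` and `g`, the half-up rounding rule and the helper names are assumptions chosen for illustration; the text only states that some rounding rule is required.

```python
import math


def nearest(value):
    """One possible rounding rule: nearest integer, halves rounded up."""
    return math.floor(value + 0.5)


def source_position(position, functions):
    """Apply one function per dimension, then round to the nearest element."""
    return tuple(nearest(fn(*position)) for fn in functions)


# Hypothetical example: a scale change combined with a sub-pixel displacement.
f = lambda x, y: 2 * x + 0.4
g = lambda x, y: 2 * y - 0.4
pixel = source_position((3, 5), (f, g))
```

The same helper accepts three functions of (x; y; t) for an image sequence, or n functions for an object with n dimensions.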
- the absolute or relative position can correspond to 3 values (vertical, horizontal and temporal), for example (-1; 0; 0) indicates the pixel located above in the same image, (0; 0; -1) indicates the pixel having the same position in the previous image and (2; -2; -1) indicates the pixel located 2 pixels below and 2 to the left in the previous image.
- a combination of relative displacement and relative scale can be encoded using 3 functions f, g, h as follows: (f(x; y; t); g(x; y; t); h(x; y; t)) for each pixel of absolute position x, y at time t. It should be noted that a rounding rule is necessary in order to take, for example, the nearest pixel.
- the absolute or relative position may correspond to 1 value
- the absolute or relative position can correspond to 2 values (time, channel), for example (-1; 0) indicates the previous instant of the same channel, and (2; 1) indicates 2 instants later on the next channel, the channels being ordered, for example, spatially in a circular manner.
- a combination of relative displacement and relative scale can be encoded using 2 functions f, g as follows: (f(t; c); g(t; c)) for each sound sample positioned at time t for channel c. It should be noted that a rounding rule is necessary in order to take, for example, the nearest instant and channel.
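For a multichannel sound object, a (time; channel) relative position with circularly ordered channels can be sketched as below. The wrap-around by modulo follows the "circular" ordering given as an example in the text; the function name and the channel count are assumptions.

```python
def neighbour_sample(t, c, displacement, n_channels):
    """Apply a (time; channel) relative position; channels wrap around circularly."""
    d_t, d_c = displacement
    return (t + d_t, (c + d_c) % n_channels)


previous = neighbour_sample(10, 0, (-1, 0), 4)   # previous instant, same channel
later = neighbour_sample(10, 3, (2, 1), 4)       # 2 instants later, next channel
```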
- the absolute or relative position may correspond to n values each corresponding to a spatial or temporal dimension, a function of the topology of the mesh.
- n functions each corresponding to a spatial or temporal dimension, a function of the topology of the mesh.
- a combination of relative displacement and relative scale can be encoded using n functions. It should be noted that a rounding rule is necessary in order to take, for example, the node and the nearest instant.
- the absolute or relative position may correspond to n values respectively corresponding to the time, if any, to the frequency channel (transmission or reception on several frequencies) and if necessary (several transmitters or spatially arranged receivers) to a spatial dimension.
- a combination of relative displacement and relative scale can be encoded using n functions, and a rounding rule should be chosen.
- the absolute or relative position may correspond to n values each corresponding to a dimension of the object which, depending on the case, may be of a temporal, spatial, frequency, phase or other nature.
- a combination of relative displacement and relative scale can be coded using n functions and a rounding rule should be chosen.
- the absolute or relative position may correspond to n values each corresponding to a dimension of the object which, depending on the case, may be of a temporal, spatial, frequency, phase or other nature.
- a combination of relative displacement and relative scale can be encoded using n functions and a rounding rule should be chosen.
- Different types of non-overlapping sub-objects are illustrated in FIGS. 1a to 1d.
- The same image can be cut into lines (lines 90, 91, 92 and 93 in FIG. 1a), into columns (columns 94, 95, 96 and 97 in FIG. 1b), into sub-objects of a completely different shape (forms 70, 71, 72 and 73 in FIG. 1c), or into rectangles (forms 60, 61, 62, 63, 64, 65, 66 and 67 in FIG. 1d).
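The line, column and rectangle decompositions can all be produced by one tiling routine. This is an illustrative sketch: the 4 x 8 image size, the tuple layout `(top, left, height, width)` and the function name `tiles` are assumptions, chosen only so that the counts match the four lines, four columns and eight rectangles mentioned in the figures.

```python
def tiles(height, width, tile_h, tile_w):
    """Non-overlapping sub-objects as (top, left, height, width), row by row."""
    if height % tile_h or width % tile_w:
        raise ValueError("object size must be a multiple of the sub-object size")
    return [(top, left, tile_h, tile_w)
            for top in range(0, height, tile_h)
            for left in range(0, width, tile_w)]


lines = tiles(4, 8, 1, 8)        # 4 lines
columns = tiles(4, 8, 4, 2)      # 4 columns
rectangles = tiles(4, 8, 2, 2)   # 8 rectangles
```

In each case every sub-object contains the same number N of elementary information, and together they cover the object exactly once.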
- When the sub-objects have no overlap, it is necessary to access elementary information of at least one other sub-object in order to process the elementary information of a sub-object without losing an edge during the calculation of filters.
- The object to be processed has DO dimensions and is decomposed into sub-objects having DSO dimensions selected from the DO dimensions of the object; the decomposition of the object is such that, along at least one dimension of the sub-objects, the sub-objects have no overlap.
- the method further comprises the step of performing, for each sub object, exactly N times each specific operation. Preferably one will choose DSO equal to DO.
- The method further comprises the step of adding at least one elementary information to the object so that it can be decomposed into sub-objects without overlap.
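Adding elementary information so that the object divides evenly into sub-objects can be sketched as a padding step. The fill value (0) and the choice to append at the end are assumptions; the text does not say which values are added or where.

```python
def padded_size(size, tile):
    """Smallest multiple of `tile` that is at least `size`."""
    return -(-size // tile) * tile


def pad_row(row, tile, fill=0):
    """Append `fill` values (an assumed choice) until the length is a multiple of `tile`."""
    return list(row) + [fill] * (padded_size(len(row), tile) - len(row))


padded = pad_row([1, 2, 3, 4, 5], 4)
```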
- the sub-object decomposition can also depend on the sequence of operations to be performed on the object, including the number and type of filters, horizontal or vertical, present in this sequence.
- FIG. 1a shows a sub-object composed of 6 × 6 elementary information, in the case where the sequence of operations loses a pixel on each edge, and FIG represents an object comprising 100 elementary information.
- the sub-objects are four rectangles 80, 82, 83 and 84 each containing 36 elementary information.
- the rectangle 80 consists of the 36 elementary information located at the top left in the image, and the rectangle 82 is made up of the 36 elementary information at the top right of the image.
- the 8 elementary information 86 are common to the two sub-objects 80 and 82.
- the 8 elementary information 85 are common to the two sub-objects 80 and 83.
- the 8 elementary information 88 are common to the two sub-objects 82 and 84.
- the 8 elementary information 89 are common to the two sub-objects 83 and 84.
- the 4 elementary information 87 are common to the four sub-objects 80, 82, 83 and 84.
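The overlap counts in this example can be verified by enumeration. The sketch below assumes the 100 elementary information form a 10 x 10 square, and that rectangles 83 and 84 sit at the bottom left and bottom right (an inference from which pairs share data); coordinates and names are hypothetical.

```python
def rect(top, left, size=6):
    """Set of (row, col) positions covered by a size x size sub-object."""
    return {(r, c) for r in range(top, top + size) for c in range(left, left + size)}


# Four overlapping 6 x 6 sub-objects on an assumed 10 x 10 object.
r80, r82, r83, r84 = rect(0, 0), rect(0, 4), rect(4, 0), rect(4, 4)

common_all = r80 & r82 & r83 & r84
shared_80_82 = (r80 & r82) - common_all
shared_80_83 = (r80 & r83) - common_all
shared_82_84 = (r82 & r84) - common_all
shared_83_84 = (r83 & r84) - common_all
covered = r80 | r82 | r83 | r84

# Each sub-object loses one pixel per edge: 6 x 6 inputs give 4 x 4 valid results.
valid_per_sub_object = (6 - 2) * (6 - 2)
```

With overlapping sub-objects, 4 x 36 = 144 elementary information are read to process an object of 100, which is the redundancy that non-overlapping sub-objects avoid.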
- the object is an image
- the image is decomposed into juxtaposed rectangular sub-objects
- the sub-objects are processed, for example, from left to right and then from top to bottom.
- the sub-objects are chosen and stored in one of the following ways, without the list being exhaustive:
- the size of the sub-objects is chosen so that a sub-object can be processed without access to the slow memory; for example, sub-objects corresponding to squares of 32x32 pixels may be taken, the result of the calculation on the preceding sub-object being transferred to slow memory during the calculation relating to the current sub-object, while the data necessary for the calculation relating to the following sub-object are transferred from slow memory to fast memory;
- the size of the sub-objects is chosen so that a sub-object can be processed making the greatest possible use of cache memory; for example, sub-objects corresponding to squares of 32x32 pixels, or sub-objects of 1 pixel, or sub-objects of 4 pixels (2 * 2) or Nl * 2 pixels may be taken, in particular in the case of a raw image;
- the size of the sub-objects is chosen equal to, or a multiple of, the size of a vector that the platform is able to process and store; for example, sub-objects corresponding to 64 horizontal pixels may be taken.
- The decomposition into sub-objects can similarly be adapted to the platform.
- the method according to the invention makes it possible to regularize the sequencing of the specific operations performed on the sub-objects, since the same number N of operations is performed each time.
- N the number of operations performed each time.
- the fact of carrying out N operations each time is made possible by the fact that, in carrying out these operations, elementary information belonging to a sub-object different from that on which the operations are applied is involved.
- The elementary information on which the operation k applies may belong to the same sub-object or to different sub-objects, depending on the type of specific operation and the position of the elementary information in the sub-objects.
- The platform comprises at least one inter-object communication memory for storing the elementary information and / or results of specific operations calculated during the processing of a sub-object and used for the processing of another sub-object.
- the redundant calculations are reduced.
- sequence of specific operations comprises only one specific operation implementing the same data during the processing of the object.
- The term communication data denotes elementary information and / or results of operations that are used for the processing of several sub-objects or for several different specific operations.
- The communication data between sub-objects are chosen so as to minimize their size and the number of calculations.
- The inter-object communication data along one dimension include, in particular, the input data of a filter along this dimension, as well as the data to be combined with the output of the filter if they are not correctly aligned with each other.
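The role of the inter-object communication memory can be illustrated with a one-dimensional filter applied sub-object by sub-object. This is a sketch under stated assumptions: a causal FIR filter, a zero-initialized communication memory at the start of the object, and hypothetical function names; none of these details is specified by the text.

```python
def fir_whole(signal, taps):
    """Reference: causal FIR filter over the whole object (zero history assumed)."""
    ext = [0] * (len(taps) - 1) + list(signal)
    return [sum(t * ext[i + k] for k, t in enumerate(taps))
            for i in range(len(signal))]


def fir_by_sub_objects(signal, n, taps):
    """Same filter, applied to non-overlapping sub-objects of n elements.

    The filter inputs needed by the next sub-object are kept in a small
    inter-object communication memory instead of being re-read or recomputed.
    """
    memory = [0] * (len(taps) - 1)          # inter-object communication memory
    results = []
    for start in range(0, len(signal), n):
        sub = list(signal[start:start + n])
        ext = memory + sub
        # Exactly len(sub) results are produced for each sub-object.
        results.extend(sum(t * ext[i + k] for k, t in enumerate(taps))
                       for i in range(len(sub)))
        memory = ext[len(sub):]             # last len(taps) - 1 filter inputs
    return results
```

Each sub-object of N elementary information yields N results without overlap between sub-objects, and the communication memory holds only the filter width minus one values.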
- the inter-object communication memory that is to say the memory used to store the inter-object communication data, is of different nature depending on the necessary storage time and the bit rate.
- This memory may consist of registers and / or a local memory, for communication along the dimension of the inner loop of the sub-objects, and / or a local and / or shared memory for communication along the other dimensions.
- the object comprises DO dimensions
- the elementary information is transmitted to the platform first according to a selected dimension DE, then according to the other dimensions.
- the sub-objects have DSO dimensions, selected from the DO dimensions of the object, and including the dimension DE, and the processing comprises at least one inner loop of sub-object traversing performed according to the dimension DE.
- An internal loop corresponds to a loop for processing a sub-object, which makes it possible to process the N data using Q processors.
- This embodiment is particularly suited to a component processing data "on the fly", that is to say in real time at the input speed of the elementary information in the platform, using, for inter-object communication, a memory located on the same component as the calculation units used for processing.
- the cost of the component is reduced, and the access rate to the memory is proportional to the number of calculation units.
- this embodiment is used in the case of a scalar, vector, or pipelined processor.
- the sub-objects are such that DSO is one, or DSO is two in the case of a raw image. In the latter case, the size of the sub-object in the second dimension is two.
- the specific operations are performed by calculation units arranged according to the dimension DE.
- the size of the sub-objects in each dimension is a multiple of the size of the processor matrix in the considered dimension.
- the DSO dimensions are the DE dimension, as well as the smallest of the DO dimensions, to limit the necessary inter-object communication memory.
- the loops of the sub-object are nested in the same order as the dimensions according to which the elementary information arrives at the platform.
- the elementary information is transmitted to the platform according to the dimension DE of the object, then according to the other dimensions.
- the sub-objects comprise the DO dimensions of the object, or DO-1 dimensions selected from the DO dimensions of the object, the dimension DE not being included.
- the processing further comprises at least one inner loop of sub-object traversing performed according to the dimension DE.
- the size of the sub-objects in each dimension is determined according to the size of the object, and / or the transmission rate of the elementary information transmitted to the platform, and / or the speed of calculation of the platform and / or the size and throughput of at least one platform memory.
- This embodiment is particularly adapted to the case of a component processing data "on the fly", at the input speed of the elementary information in the platform, using, for inter-object communication, a local memory located on the same component as the calculation units used for the processing, serving as a relay to a slower shared external memory used for long-term storage of the communication data according to the DO-1 dimensions not corresponding to DE.
- the size of the local memory increases with the size of the sub-object
- the bit rate with the shared memory decreases with the size of the sub-object
- the size of the external memory increases with the size of the sub-object according to the dimensions other than the dimension DE. It is thus possible to adjust the size of the internal and external memories and the bit rate of the external memory by adjusting the size of the sub-object.
- the cost is reduced, the throughput with the memory is independent of the size of the object, and a component can be optimized for an object size and reused with external memory for larger objects.
- This embodiment is also more particularly adapted to the case of a component processing the elementary information more slowly than the input speed of the elementary information in the platform, and using a memory to store the object during the processing.
- it is sought to limit the size of the internal memory and the processing speed to reduce the internal memory size, the number of calculation units and the memory capacity required.
- This embodiment applies in particular to a scalar or vector processor or pipeline.
- the specific operations are carried out by Q calculation units arranged along a dimension of size greater than Q.
- the size of the sub-objects in each dimension is a multiple of the size of the processor matrix in the considered dimension.
- the DSO dimensions are, in addition to the dimension DE, the smallest of the DO dimensions, to limit the inter-object communication memory required.
- the loops are nested and the code is compact.
- the inter-object communication data produced during the calculation of the previous sub-object are transferred from the local memory to the external memory, and the inter-object communication data necessary for the calculation of the next sub-object are transferred from the external memory to the local memory.
- since the inner loop of the sub-objects runs along the dimension DE, the transfers between internal and external memory concern only the inter-object communication data according to the DO-1 dimensions, which exclude the dimension DE.
- the necessary local memory is limited to 3 times the size of the inter-object communication data according to these DO-1 dimensions, plus once the size of the inter-object communication data according to the DE dimension.
- the size of the required internal memory is limited to a few hundred or thousands of bytes to process several tens of millions of pixels per second.
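The local-memory bound stated above (three times the communication data along the dimensions other than DE, plus once the communication data along DE) can be written as a one-line calculation. This is only a sketch: the function name `local_memory_bound` and the byte counts in the example are illustrative assumptions, not figures from the text.

```python
def local_memory_bound(other_dims_bytes: int, de_bytes: int) -> int:
    """Upper bound on local memory, in bytes (hypothetical helper).

    other_dims_bytes: size of the inter-object communication data along
                      the dimensions that exclude DE.
    de_bytes:         size of the inter-object communication data along DE.
    """
    if other_dims_bytes < 0 or de_bytes < 0:
        raise ValueError("sizes must be non-negative")
    # 3 x (data along the other dimensions) + 1 x (data along DE)
    return 3 * other_dims_bytes + de_bytes


# Assumed example sizes: 256 bytes along the other dimensions, 64 along DE.
bound = local_memory_bound(256, 64)
```

With these assumed sizes the bound stays under one kilobyte, which is in line with the order of magnitude mentioned for the internal memory.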
- the data flows in the following way: the elementary information of the object is stored in the external memory; the sub-objects are read from the external memory when at least one sub-object is present in shared memory; the inter-object communication data according to the DO-1 dimensions, which exclude the dimension DE, are read from the shared memory; the result of the processing of the sub-object and the inter-object communication data according to the DO-1 dimensions, which exclude the dimension DE, are written in shared memory; when the complete data according to the dimension used for the output are present in shared memory, they are read from shared memory and made available.
- the transit is thus predictable, simple and regular.
- the loops of the subobjects are nested in the same order as the dimensions according to which the elementary information arrives at the platform.
- the specific operation sequence comprises at least one specific selection operation which selects a parameter value among C parameter values at the same time on the Q calculation units. This selection is done in a differentiated manner by processor, according to at least one elementary information and / or at least one specific operation result and / or at least one parameter value.
- C is 8.
- a specific selection operation can be used as a function of the input value n giving the interpolation coefficient.
- with the Q processors corresponding to a one-dimensional vector, the C constants are extended into a register of Q constants by duplication: C constants at the right of the vector, then C constants, then ...; the specific selection operation makes it possible to choose one of the C values to the left of each element of the vector.
- the sequence of specific operations comprises at least one specific selection operation performing the selection of one of C data items, simultaneously on the Q calculation units and in a differentiated manner per processor, as a function of a relative displacement obtained from at least one elementary information and / or at least one specific operation result and / or at least one parameter value.
- each computing unit can access data from its C left neighbors simultaneously and independently.
- the specific selection operation may be conditional, in order to allow selection from 2 * C or more data.
- a local deformation for example a distortion according to a position
- a displacement requiring more than C data can be decomposed into at least one uniform displacement followed by a differentiated local displacement carried out using at least one selection operation; the uniform displacement can be realized, in particular, by applying several selection operations, or by using a communication memory.
- the relative displacement is common to all the elementary information of a sub-object, and / or an object. In another example, it is different for each elementary information, and may or may not depend on the absolute position of the elementary information in the sub-object and / or the sub-object in the object. More generally, this displacement is the result of a calculation based on at least elementary information and / or at least one specific operation result and / or at least one parameter value.
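The differentiated selection described above can be sketched as follows: each lane (calculation unit) picks one value among C candidates located at or to the left of its own position, using its own relative displacement. The function `select_from_left`, the convention that displacement 0 designates the lane's own position, and the zero fill at the left border are assumptions made for illustration; a real platform would obtain out-of-range data from communication memory.

```python
from typing import List


def select_from_left(values: List[int], shifts: List[int], c: int) -> List[int]:
    """Per-lane selection among C candidates (hypothetical model).

    Lane i reads values[i - shifts[i]]; every shift is chosen
    independently per lane and must lie in [0, c - 1].
    """
    if len(values) != len(shifts):
        raise ValueError("one shift per lane is required")
    out: List[int] = []
    for lane, shift in enumerate(shifts):
        if not 0 <= shift < c:
            raise ValueError("shift must be in [0, C - 1]")
        src = lane - shift
        # Assumed border handling: lanes reaching past the left edge get 0.
        out.append(values[src] if src >= 0 else 0)
    return out


values = [10, 20, 30, 40, 50, 60, 70, 80]   # Q = 8 lanes
shifts = [0, 1, 0, 2, 1, 3, 0, 2]           # one displacement per lane
selected = select_from_left(values, shifts, c=4)
```

Because every lane uses its own displacement, the same operation can express a local deformation; a larger displacement would first be reduced by a uniform shift, as the text describes.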
- the position specific operation produces position information according to one of the DO dimensions.
- the position can be, in particular, without the list being limiting, an absolute position of elementary information in an object, a sub-object position, a modulo C processor position, a multi-scale data position in a sub-object, a relative position with respect to a grid or any other position.
- the sequence of specific operations comprises at least one specific operation producing a relative position as a function of at least one elementary information and / or at least one specific operation result and / or at least one parameter value.
- the specific operation of relative position calculation can be used before a specific selection operation.
- the relative position is common to all the elementary information belonging to a sub-object, and / or to an object. In another example, it is different for each elementary information, or it may depend, or not, on the absolute position of the elementary information in the sub-object and / or the sub-object in the object. More generally, it can be the result of a computation based on at least one elementary information and / or at least one specific operation result and / or at least one parameter value.
- the size of the sub-objects, namely the number N of elementary information present in each sub-object, is, for example, determined according to the architecture of the platform used for processing.
- at least a portion of the specific operations is performed by Q calculation units, Q being equal to N or a sub-multiple of N. The fact that N is a multiple of Q makes the processing even more regular, since all calculation units complete the same calculation step at the same time.
- the number of processors Q and the number of elementary information N are different, and the processing of the sub-object comprises a single internal loop of N / Q iterations.
- the processing is regular, the memory and the number of registers used are minimized, and the communication within each sub-object is preferably done with registers.
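A minimal sketch of the single inner loop of N / Q iterations may help here. It models Q calculation units working in lockstep on a sub-object of N elementary information items; the function name `process_sub_object`, the list-based representation and the sample operation are illustrative assumptions, not part of the described platform.

```python
from typing import Callable, List


def process_sub_object(data: List[int], q: int,
                       op: Callable[[int], int]) -> List[int]:
    """Apply one specific operation to the N items of a sub-object.

    Hypothetical model: q calculation units work in lockstep, so the
    N items are consumed in N / q iterations of a single inner loop.
    """
    n = len(data)
    if q <= 0 or n % q != 0:
        raise ValueError("N must be a multiple of Q for a regular loop")
    results: List[int] = []
    for it in range(n // q):                  # N / Q iterations
        lane_inputs = data[it * q:(it + 1) * q]
        # Each of the q units applies the same operation to its own item.
        results.extend(op(x) for x in lane_inputs)
    return results


sub_object = [1, 2, 3, 4, 5, 6, 7, 8]         # N = 8
doubled = process_sub_object(sub_object, q=4, op=lambda x: 2 * x)
```

Each specific operation is applied N times and produces N results, whatever the value of Q; only the number of loop iterations changes.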
- the number of calculation units is from a few tens to a few hundreds, which makes it possible, in particular, to perform calculations of a few hundred operations on images at several tens of millions of pixels per second using components manufactured with 0.13 µm processes.
- the number of calculation units is from several thousand to several million, and the invention makes it possible to use this computing power to process objects while keeping a great simplicity of programming and with a performance proportional to the number of calculation units.
- the number P of specific operations can also be a multiple of Q.
- specific operations are determined upstream of the platform by a compiler, which is configured so that, if the number of specific operations is not a multiple of Q, it creates specific operations with no effect in order to obtain this relation (the number of specific operations is a multiple of Q).
- the treatment will be perfectly regular.
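The padding step attributed to the compiler can be sketched in a few lines. The operation names and the `NOP` marker are hypothetical; the text only states that operations with no effect are added until the number of specific operations is a multiple of Q.

```python
from typing import List

NOP = "nop"  # hypothetical name for a specific operation with no effect


def pad_operations(ops: List[str], q: int) -> List[str]:
    """Return ops extended with no-effect operations so that len % q == 0."""
    if q <= 0:
        raise ValueError("Q must be positive")
    padding = (-len(ops)) % q     # number of no-effect operations to add
    return ops + [NOP] * padding


padded = pad_operations(["add", "mul", "sat"], q=4)
```

A sequence whose length is already a multiple of Q is returned unchanged, since the computed padding is then zero.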
- the Q calculation units present in the platform are identical.
- parameters may, for example, be multiplying coefficients.
- These parameters may correspond, for example, without the list being limiting to: filter coefficients, and / or
- saturation values and / or offset values, and / or mapping tables.
- the values of the parameters used by the specific operations depend on the position in the sub-objects of the elementary information involved, directly or indirectly, in these specific operations.
- defects may appear on the image, due to the optics used to take the picture. These defects are, in general, not homogeneous throughout the image, especially on the edges.
- the use of a parameter common to all the elementary information for a filter makes it possible to increase the sharpness in a uniform manner.
- the use of a parameter depending on the absolute position of the elementary information in the object to be processed, for a filter, makes it possible to increase the sharpness more strongly at the edges in order to compensate for an optical defect.
- the use of a parameter dependent on the absolute position of the elementary information in the object to be processed, for a vignetting correction, makes it possible to obtain a stronger compensation at the edges in order to compensate for an optical defect.
- a parameter depending on the absolute position of the elementary information in the object to be processed, for a demosaicing, makes it possible to treat the red pixels, the green pixels and the blue pixels differently.
- second data in particular a displacement
- zoom which depends on the absolute position of the elementary information in the object to be processed for a digital zoom calculation ("zoom") or a distortion correction
- a parameter depending on the nature of this parameter, can:
- the parameter value may, in particular, be transmitted to the processing means or to the platform, and / or to extension of the origin or destination of the object, for example in the case where the object to be processed is an image from an apparatus provided with a given optic, the value of the parameter may depend on the type of optics that has an impact on the level of blur in the image; in this case, the parameter value can in particular be transmitted to the processing means or to the platform, and / or
- the value of the parameter may depend on the gain of the sensor actually used to capture said object that has an impact on the noise level in the image; in this case, the parameter value may, in particular, be transmitted, chosen or calculated by the platform, and / or
- the parameter value can in particular be transmitted, chosen or calculated by the platform, and / or
- the parameter value can be determined simultaneously or a posteriori with respect to the definition of the algorithm.
- the value of the parameter is calculated at each change.
- the possible values of the parameter are calculated a priori, and, at each change, the index or the address making it possible to access the value of the parameter for example in a table is determined.
- a limited number of sets of parameter values are determined, each set is stored and, for each sub-object, the set to be used is selected, for example by calculating a function of the position giving the address of the set to use.
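One way to picture the last option is a small table of pre-computed parameter sets indexed by a function of the sub-object position. The table contents, the function `parameter_set_index` and the choice of the squared distance to the image center as the position function are all assumptions for illustration; the text does not specify the function.

```python
from typing import List, Tuple

# Hypothetical pre-computed sets: (sharpening gain, vignetting gain),
# ordered from the image center to the image corners.
PARAMETER_SETS: List[Tuple[float, float]] = [
    (1.0, 1.0),
    (1.2, 1.1),
    (1.5, 1.3),
    (2.0, 1.6),
]


def parameter_set_index(x: int, y: int, width: int, height: int,
                        n_sets: int) -> int:
    """Map a sub-object position to the index of the set to use.

    Assumed position function: squared distance to the image center,
    normalized by its maximum value and quantized into n_sets bands.
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    r2 = (x - cx) ** 2 + (y - cy) ** 2
    r2_max = cx ** 2 + cy ** 2
    if r2_max == 0:
        return 0
    return min(n_sets - 1, int(n_sets * r2 / r2_max))


center_set = PARAMETER_SETS[parameter_set_index(50, 50, 101, 101, 4)]
corner_set = PARAMETER_SETS[parameter_set_index(0, 0, 101, 101, 4)]
```

Only the index is computed per sub-object; the parameter values themselves are read from the stored sets, which keeps the per-sub-object cost small.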
- the assignment of the specific operations to the calculation units depends on both the type of operation, the sequence, and the calculation units themselves.
- the calculation units are specialized, that is to say that the N results from the same specific operation are calculated by the same calculation unit.
- the fact of having specialized computing units saves time, since the computing unit in charge of this operation can realize a memory access at the beginning of the processing, to retrieve the parameter, then apply the operation N times without having to perform a memory access again.
- when at least one specific operation implements at least one parameter, this specific operation is performed by at least one calculation unit having access to a memory unit containing a part of the parameter values, this part being determined according to the specific operations performed by that calculation unit.
- for storage of these parameter values, there may be different hardware configurations, which will be detailed later.
- each calculation unit may have its own memory, or there may be a memory common to all units, or else the calculation units may be grouped together, and have a memory for each group.
- the value of this parameter is a function of the position of the sub-object and / or the elementary information in the object to be processed.
- for some parameters, the value will be fixed for the whole object to be processed, while for others, this value will be variable depending on the position.
- the coefficients of an image blur correction filter may be more or less strong depending on whether one is at the center or at the edge of the image.
- the configuration of the computing units may vary.
- the specific operations are performed by chained calculation units.
- the calculation units can be chained "in series" or according to a tree, and the results of the calculations on a basic information to be processed transit from one unit to another.
- Such a configuration is made possible by the fact that the processing is regular, and that the transit of the elementary information can therefore also be done regularly.
- the calculation units can also be placed in parallel, in order to process several basic information simultaneously.
- the calculation units are chained to be able to combine calculation results from different elementary information, for example filters.
- the calculation units are chained according to a one-dimensional chaining. In another embodiment, the calculation units are chained according to at least one circular chaining. This last embodiment makes it possible to obtain uninterrupted processing, since when elementary information has passed through all the calculation units, and has undergone a certain number of specific operations, it is immediately transmitted again to the first calculation unit.
- the chaining also comprises at least one queue.
- This embodiment of the process can be implemented as such.
- the specific operations are performed by computation units chained according to at least one circular chaining for each dimension of the sub-object; the circular chaining (s) for each particular dimension D1 of the sub-object further comprising at least one shared or non-shared queue between the circular chaining (s) for the particular dimension D1 of the subobject.
- the queue is shared between all the circular chaining of the same dimension.
- there is exactly one queue per dimension of the sub-object and each queue is shared between all the circular chaining in the same dimension.
- the specific operations are performed by calculation units chained according to a given dimension DD of the sub-object by means of at least one circular chaining CC1; said circular chaining CC1 further comprising at least one queue.
- the method further comprises the step, for at least one specific instruction, for each application of said specific instruction, of transmitting the result of this application of the specific instruction performed on a calculation unit UC1 to the unit.
- the sub-object comprises DSO dimensions
- the specific operations are performed by calculation units chained according to a given dimension DD of the sub-object by means of at least one circular chaining CC1; said circular chaining CC1 further comprises at least one queue; the method further comprises the step, for at least one specific instruction, for each application of said specific instruction, of transmitting the result of said application of the specific instruction performed on a computing unit
- the computing units each belong to at least one string, in each of the dimensions of the sub-object.
- the method further comprises the step, for at least two specific instructions, for each application of one of the two specific instructions, of transmitting the result of said application of the specific instruction performed on a calculation unit UC1 to the calculation unit UC2 or queue which follows said calculation unit UC1 according to a predetermined chaining for each specific instruction.
- the chaining used depends in practice on the type of filter performed (vertical or horizontal, for example) by the specific instruction sequence.
- the specific operations are of two types: either they do not implement any chaining, or they systematically implement a chaining, that is to say each time they are performed; in this case, all chaining implemented by the same specific operation by the different processors are in the same dimension.
- the specific operations are performed by calculation units chained according to at least one circular chaining; said chaining makes it possible to transmit the result of a specific operation to the processor or queue that follows, in the chaining, the processor having produced said result.
- the specific operations are performed by calculation units chained according to at least one circular chaining; said circular chaining further comprising at least one queue; said queue making it possible to transmit the results of specific operations required to calculate at least one other sub-object.
- the method further comprises the step of grouping in memory the results of specific operations used during the subprocessing of another sub-object, as a function of the relative position of said other sub-object with respect to said sub-object.
- the method further comprises the step of grouping in at least one queue the results of specific operations used during the subprocessing of another sub-object.
- the method includes the additional step of providing the platform with instructions for keeping in memory or in a queue at least a portion of these results of specific operations.
- the method comprises the step of assigning the specific operations to the calculation units according to the chaining of the calculation units, and the sequence. This step can also be performed by a compiler located upstream of the platform.
- the calculation units are programmable, that is, the sequence of specific operations and / or the assignment of the specific operations to the different calculation units may be modified after completion of the component containing these calculation units.
- the order and / or the nature of the specific operations are modifiable.
- the calculation units are programmable, they can be programmed a first time at the time of the realization of the component.
- the specific operations are performed by hardwired computing units according to at least one predetermined sequence of specific operations.
- This embodiment makes it possible, for example, to dispense with the use of an external memory. Indeed, instead of having such a memory containing the sequencing of the operations to be performed for an algorithm, it is possible to wire the calculation units in such a way that the operations are performed in an order corresponding to the algorithm that one wants to apply to an object.
- the platform on which the processing is performed can have different types of memory that vary, both in terms of capacity and in terms of access rate.
- a fast memory and / or registers can be used to store results of operations for the short term, in the case of operations such as filters that require the immediate reuse of certain data.
- at least one specific operation is performed by at least one calculation unit with a limited capacity memory unit for storing elementary information and / or results of specific operations, this memory containing at most sixteen elementary information and / or results of specific operations.
- Since fast memories usually have a limited capacity, it is necessary, in some cases, to also have a larger capacity memory to store more elementary information and / or results of specific operations.
- At least one specific operation is performed by at least one calculation unit having access to a communication memory unit, containing at least one elementary information and / or at least one specific operation result from other sub-objects.
- This communication memory is generally used to store, for the long term, elementary information and / or results of specific operations used for processing other sub-objects. Only a portion of the specific operations produce or use such data, and the required throughput is therefore limited.
- the regularity provided by the invention makes it possible to very simply determine what these data are and thus to dispense with a cache memory mechanism, which reduces the complexity and the cost of the platform.
- the communication memory unit has an access rate less than 0.3 * N access / sub-object / specific operation. Such memory, with a relatively slow access rate, will be less expensive than if one wanted to use a memory both fast and large capacity. This is an advantage of the invention.
- when the processing platform is such that the memory capacity is reduced, it is necessary to choose a size of sub-objects such that the processing can be applied correctly.
- the value of Q is set at 1 and the value of N is between 2 and 16.
- the platform is intended to handle photographs taken by the mobile phone, all operations will apply to only one pixel at a time.
- the platform comprises a vector processor, it is possible to have a large number of calculation units.
- This hardware configuration makes it possible, if the computing units are used wisely, to speed up the process of processing an object. For this purpose, in one embodiment, at least one specific operation is performed simultaneously by at least two identical calculation units. The invention then makes it possible, by the regularity of the processing, to make the best use of the processors.
- the specific operations comprise at least one specific calculation operation taken in the group comprising: addition, subtraction, multiplication, application of a correspondence table, the minimum, the maximum, the selection
- At least one specific calculation operation also produces an offset, and / or a saturation and / or a rounding.
- the specific selection calculation operation makes it possible to choose one of at least two data as a function of the value of a third data item.
- the application of a correspondence table is performed by a calculation implementing the entry of the table and a limited number of coefficients.
- the limited number of coefficients is set to 8.
- the specific operations are performed by calculation units chained by means of at least one circular chaining CC1; said circular chaining CC1 further comprising at least one queue; at least one specific instruction IS4 of the specific instruction sequence transmitting the result of a specific instruction IS5 performed on a calculation unit UC1 to the calculation unit UC2 or queue which follows said calculation unit UC1 according to said chaining.
- the specific instruction IS4 transmits, from the queue, to the calculation unit UC0 that follows the queue, the result of a specific instruction IS5 performed during a previous subprocessing.
- the queue makes it possible to output the data in the same order as they were entered in the queue.
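The role of the queue in a circular chaining can be illustrated with a simple two-tap filter, y[i] = x[i] + x[i-1], computed across sub-object boundaries. In this sketch each unit forwards its value to the next unit, and the last unit forwards it to a first-in, first-out queue that feeds the first unit when the next sub-object is processed. The function name, the filter and the initial queue content (0) are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List


def chained_pair_sum(sub_objects: List[List[int]]) -> List[List[int]]:
    """Compute y[i] = x[i] + x[i-1] across sub-object boundaries.

    Hypothetical model of a one-dimensional circular chaining with
    one queue placed between the last and the first calculation unit.
    """
    queue: Deque[int] = deque([0])           # assumed initial content
    outputs: List[List[int]] = []
    for lanes in sub_objects:
        from_queue = queue.popleft()          # oldest entry leaves first
        # Value each unit receives from its predecessor in the chain.
        left = [from_queue] + lanes[:-1]
        queue.append(lanes[-1])               # last unit feeds the queue
        outputs.append([x + l for x, l in zip(lanes, left)])
    return outputs


filtered = chained_pair_sum([[1, 2, 3, 4], [5, 6, 7, 8]])
```

Because the queue releases data in the order in which it was entered, the value produced at the end of one sub-object is exactly the one consumed at the start of the next.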
- N and Q vary according to the embodiments. Each of the embodiments has different advantages. Thus, in one embodiment N is not a multiple of Q. In a variant of this embodiment, Q is equal to the number of specific operations of the sequence obtained by translating the generic sequence of operations.
- N is a multiple of Q. This makes the processing regular.
- N = Q. This makes it possible to reduce the amount of memory required for storing temporary results.
- Q > 1 and N = Q. This makes it possible to use the Q calculation units of a vector processor at 100%.
- Q > 1 and N is a multiple of Q. This makes it possible to use the Q calculation units of a vector processor at 100%, while reducing the number of results of specific operations carried out during the processing of a sub-object and used for processing another sub-object.
- each processor performs all operations of the sequence of specific operations. In one embodiment, all the processors perform the same specific operation at the same time. In another embodiment, all the processors perform the same specific operation successively, which makes it possible to perform recursive filters.
- Memory storage of basic information and results of operations requires the use of relatively simple addressing, so as not to lose too much time when searching for basic information.
- at least a portion of the results of specific operations are stored in memory at an address of the form "base address + offset" or "base address + offset modulo (size of a buffer memory)", the offset being constant for all the results of the same specific operation.
- the buffer memory is preferably integrated in one of the memories of the platform used for processing.
- the buffer memory can be, in particular, a queue.
- the base address is changed each time a sub-object is changed in the processing process. In one embodiment, this addressing may be used in particular for the communication data between the sub-objects according to at least one dimension.
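The addressing scheme above can be sketched as a circular buffer in which each specific operation owns a constant offset and the base address advances at every sub-object change. The stride per sub-object, the buffer size and the reading of the formula as (base + offset) modulo the buffer size are assumptions made for this illustration.

```python
from typing import List


def result_address(base: int, offset: int, buffer_size: int) -> int:
    """Address of a result: (base address + offset) modulo buffer size."""
    if buffer_size <= 0:
        raise ValueError("buffer size must be positive")
    return (base + offset) % buffer_size


def addresses_for(op_offset: int, n_sub_objects: int, stride: int,
                  buffer_size: int) -> List[int]:
    """Addresses used by one specific operation over successive sub-objects.

    Assumption: the base address advances by `stride` at each sub-object
    change, while the offset stays constant for the operation.
    """
    return [result_address(k * stride, op_offset, buffer_size)
            for k in range(n_sub_objects)]


addrs = addresses_for(op_offset=3, n_sub_objects=5, stride=4, buffer_size=16)
```

The address computation needs one addition and one modulo per sub-object, and the buffer slots are reused as soon as the base address wraps around.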
- the address calculation is common to all the processors, and a memory delivering groups of elementary information and / or results of specific operations of the size of a sub-object can be used.
- each computing unit has its own memory.
- a given address can be relative to several memories, that is to say that a memory address as defined here actually represents the set of memory addresses used by all the calculation units performing the same specific operation.
- At least a portion of the results of specific operations are stored in memory at a predetermined address for all results of the same specific operation.
- the method defined above is such that the number of calculation units needed to perform the treatment can be relatively small.
- the number of transistors of the processing platform is less than 10000 per computing unit, including the associated register unit without communication memory.
- the platform is provided, preferably directly, with specific formatted data calculated from generic formatted data, these generic formatted data including first data describing at least one sequence of generic operations, the calculation of the specific formatted data being performed taking into account a flow mode of the elementary information in the platform and specific operations derived from the generic operations, these specific operations forming the sequence of specific operations to be performed on an object during its processing in the platform.
- Generic operations are operations that apply to logical blocks, that is to say abstract entities, without notion of size or shape, composed of elementary information, and which may constitute all or part of the object.
- generic formatted data is digital data for describing a processing to be performed on an object by a data processing platform, regardless of the platform itself.
- the specific formatted data can be provided directly or indirectly by using a compiler to generate a platform-specific binary from the specific formatted data.
- the generic operations comprise at least one elementary generic operation included in the group comprising: the addition of logical blocks and / or parameters, the subtraction of logical blocks and / or parameters, the calculation of the absolute value of the difference between logical blocks, the multiplication of logical blocks and / or parameters, the maximum among at least two logical blocks and / or parameters, the minimum of at least two logical blocks and / or parameters, the application of a correspondence table, the conditional choice of logical blocks and / or parameters, this choice being made as follows: if a > b we choose c, otherwise we choose d, with a, b, c and d being logical blocks and / or parameters, the histogram of a logical block, the scaling of a logical block, and an operation producing a block containing at least one coordinate.
- the elementary information is represented by fixed-point numeric values, and the elementary generic operations include offset operations, a saturation operation, and / or at least one elementary generic operation combined with this saturation operation.
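Some of the elementary generic operations listed above can be sketched on logical blocks modeled as plain lists of integers. The function names `add`, `choose` and `saturate`, the list representation and the 8-bit range in the example are illustrative assumptions; the text defines the operations only abstractly.

```python
from typing import List

Block = List[int]  # hypothetical stand-in for a logical block


def add(a: Block, b: Block) -> Block:
    """Element-wise addition of two logical blocks."""
    return [x + y for x, y in zip(a, b)]


def choose(a: Block, b: Block, c: Block, d: Block) -> Block:
    """Conditional choice: where a > b take c, otherwise take d."""
    return [ci if ai > bi else di for ai, bi, ci, di in zip(a, b, c, d)]


def saturate(block: Block, lo: int, hi: int) -> Block:
    """Clamp every value of the block to the range [lo, hi]."""
    return [max(lo, min(hi, v)) for v in block]


picked = choose([1, 5, 3], [2, 4, 3], [10, 20, 30], [-1, -2, -3])
clamped = saturate(add([200, 100], [100, 100]), 0, 255)
```

None of these functions refers to a block size or shape, which mirrors the idea that generic operations are independent of the platform; the translation into specific operations is what fixes N and Q.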
- the object to be processed is an image
- the elementary information is pixels of this image.
- the processing platform is, for example, part of an image capture and / or rendering apparatus
- the operations implement parameters whose values depend on the sequence of operations and / or of the processing platform and / or the object to be processed, these parameter values being related to the characteristics of the optics and / or the sensor and / or the imager and / or the electronics and / or the software of the camera for capturing and / or restoring the image.
- the characteristics may be, for example, intrinsic characteristics that are fixed for all objects, or characteristics that vary with the object, for example noise characteristics that vary according to the gain of a sensor.
- the characteristics may be identical for all the elementary information, or may vary with the absolute position of the elementary information, for example the blur characteristics of the optics.
- the object to be processed is a digitized sound signal and the elementary information is the sound samples of this signal, or else the object to be processed is a digital mesh and the elementary information is the spatial and temporal information characterizing each point of the mesh.
- the invention also relates to a processor (s) and memory (s) platform for processing an object (55) consisting of elementary information of the same kind (54, 56, 58, 60, 154, 156, 158, and 160), comprising means for decomposing the object to be processed into at least two sub-objects (50, 51, 52 and 53) consisting of N elementary information each (54, 56, 58, 154, 156, 158), all the sub-objects (50, 51, 52 and 53) having the same number N of elementary information, and means for performing at least one sequence of specific operations on the elementary information of each sub-object (50, 51, 52, 53), these processing means further comprising means for performing, for each sub-object, at least N times each specific operation, so as, on the one hand, to involve at least once each elementary information of each sub-object, and, on the other hand, to produce N results for each specific operation, the processing means being such that at least one specific operation (62) of the sequence of specific operations implements, directly or indirectly, at least once during its N applications
- the object to be processed comprises DO dimensions
- the sub-objects comprise DSO dimensions selected from the DO dimensions of the object
- the means for decomposing the object are such that, according to at least one dimension of the sub-object, the sub-objects have no overlap.
- the specific operations are performed by calculation units chained according to at least one circular chaining according to the dimension according to which the sub-objects have no overlap.
- the sub-objects have no overlap in any dimension.
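A decomposition into sub-objects that have no overlap in any dimension and that all contain the same number N of elementary information can be sketched as below. The two-dimensional object, the rectangular tiles and the name `decompose` are assumptions chosen for the illustration; the sketch also requires the tile size to divide the object size.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def decompose(width: int, height: int, tile_w: int, tile_h: int) -> List[List[Coord]]:
    """Split a width x height object into non-overlapping tiles of N = tile_w * tile_h elements."""
    if width % tile_w or height % tile_h:
        raise ValueError("tile size must divide the object size in this sketch")
    tiles = []
    for ty in range(0, height, tile_h):
        for tx in range(0, width, tile_w):
            tiles.append([(x, y)
                          for y in range(ty, ty + tile_h)
                          for x in range(tx, tx + tile_w)])
    return tiles
```

For a 6 x 4 object and 3 x 2 tiles this gives four sub-objects of six elements each, every element belonging to exactly one sub-object.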
- the platform comprises at least one inter-object communication memory for storing the elementary information and / or results of specific operations produced during the processing of a sub-object and used for the processing of another sub-object.
- the platform comprises means for performing a sequence of specific operations comprising only a specific operation implementing the same data during the processing of the object.
- the object comprises DO dimensions
- the elementary information is received in the platform according to a dimension DE of the object and then according to the other dimensions
- the sub-objects comprise DSO dimensions selected from the DO dimensions of the object and comprising the dimension DE
- the platform is such that an inner loop of the sub-objects, included in the processing, is performed according to the dimension DE.
- the object comprises DO dimensions
- the elementary information is received in the platform according to a dimension DE of the object and then according to the other dimensions; the sub-objects comprise the DO dimensions of the object, or DO-1 dimensions selected from the DO dimensions of the object, the dimension DE not being included
- the platform is such that an internal loop of travel of the subobjects included in the processing is performed according to the dimension DE.
- the means for breaking down the object to be processed are such that the size of the sub-objects in each dimension is determined according to the size of the object and / or the bit rate of the elementary information received by the platform, and / or the speed of calculation of the platform and / or the size and the speed of at least one memory of the platform.
- the platform comprises Q calculation units calculating the same specific operation simultaneously, and the platform further comprises means for performing at least one sequence of specific operations comprising at least one specific selection operation, this specific selection operation selecting one parameter value among C parameter values at the same time on the Q calculation units, in a manner differentiated by processor, according to at least one elementary information and / or at least one specific operation result and / or at least one parameter value.
- the platform comprises Q calculation units calculating the same specific operation simultaneously, and further comprises means for performing at least one sequence of specific operations comprising at least one specific selection operation, this specific selection operation selecting one datum among C data at the same time on the Q calculation units, in a manner differentiated by processor, as a function of at least one elementary information and / or at least one specific operation result and / or at least one parameter value.
- the platform includes means for performing a sequence of specific operations including at least one specific positional operation, and, the object having DO dimensions, this specific positional operation producing position information according to one of the DO dimensions.
- the platform comprises means for performing a sequence of specific operations comprising at least one specific operation producing a relative position as a function of at least one elementary information and / or at least one specific operation result and / or at least one parameter value.
- the platform comprises Q calculation units for performing at least a portion of the specific operations, Q being equal to N or a submultiple of N.
- the number of computation units Q is different from N, and wherein the processing of the sub-object comprises a single internal loop of N / Q iterations.
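The single inner loop of N / Q iterations can be sketched as follows. The simulation of the Q calculation units by an ordinary Python loop, the restriction to pointwise operations, and the name `process_sub_object` are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple


def process_sub_object(sub_object: Sequence[int],
                       operations: Sequence[Callable[[int], int]],
                       q: int) -> Tuple[List[int], int]:
    """Apply every operation N times using Q units and one loop of N / Q iterations."""
    n = len(sub_object)
    if n % q:
        raise ValueError("Q must be equal to N or a submultiple of N")
    values = list(sub_object)
    iterations = 0
    for step in range(n // q):          # the single inner loop
        for unit in range(q):           # Q units working at the same time (simulated)
            index = step * q + unit
            for op in operations:       # the sequence of specific operations
                values[index] = op(values[index])
        iterations += 1
    return values, iterations
```

With N = 6 and Q = 2 the loop runs three times, and each operation is applied six times in total, once per elementary information.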
- the platform is such that the
- the platform comprises means for the N results from the same specific operation to be calculated by the same calculation unit.
- the platform comprises means for performing a specific operation that implements at least one parameter, and the platform further comprises a calculation unit having access to a memory unit containing a part of the values of the parameters, this part being determined according to the specific operations performed by this calculation unit.
- the platform comprises means so that, when at least one specific operation implements at least one parameter, this parameter is a function of the position of the sub-object in an object to be processed.
- the platform comprises chained calculation units.
- the platform comprises computation units chained in one-dimensional chaining.
- the platform comprises computation units chained according to at least one circular chaining.
- This chaining may, in addition, in one embodiment, include at least one queue.
- the platform comprises computation units chained according to at least one circular chaining for each dimension of the sub-object.
- the circular chaining (s) for each particular dimension D1 of the sub-object also comprise at least one queue, shared or not between the circular chaining (s) for the particular dimension D1 of the sub-object.
- the platform comprises computation units chained according to at least one determined dimension DD of the sub-object by means of a circular chaining CC1.
- the circular chaining further comprises at least one queue; and the platform is such that, for each application of a specific instruction, the result of this application in a first calculation unit UC1 is transmitted to a calculation unit UC2 or queue which follows said first calculation unit UC1 according to said chaining.
- the platform comprises at least one memory for storing the results of specific operations used during the sub-processing of another sub-object according to the relative position of said other sub object with respect to said sub object.
- the platform comprises computation units chained according to at least one circular chaining, and means for allocating the operations specific to the calculation units according to the chaining of the calculation units and the sequence
- the platform comprises means for the order and / or the nature of the specific operations to be modifiable (s).
- the platform includes hardwired computing units for performing the specific operations according to at least one predetermined sequence of specific operations.
- the platform comprises at least one calculation unit with a limited capacity memory unit for storing elementary information and / or specific operation results, this memory containing at most sixteen elementary information items and / or specific operation results.
- the platform comprises at least one computing unit having access to a communication memory unit, containing elementary information and / or results of specific operations from other sub-objects.
- the platform is such that the communication memory unit has an access rate less than 0.3 * N access / sub-object / specific operation.
- the platform is in particular integrated in a mobile phone, and comprises means for the value of Q to be set to 1 and for the value of N to be between 2 and 16.
- the platform comprises at least two identical calculation units simultaneously performing at least one specific operation.
- the platform includes means for at least a portion of the results of specific operations to be stored in memory at an address of the form "base address + offset" or "base address + offset modulo (buffer size)", the offset being constant for all results of the same specific operation.
- the platform includes means for changing the base address each time a sub-object is changed in the processing process.
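The addressing scheme can be illustrated by the sketch below. It assumes that the modulo applies to the sum of base address and offset, as in a circular buffer, and that the base address advances by a fixed stride at each new sub-object; both points, and the function names, are assumptions rather than statements of the source.

```python
from typing import List, Optional


def result_address(base: int, offset: int, buffer_size: Optional[int] = None) -> int:
    """Address of a result: base + offset, optionally wrapped to a circular buffer."""
    address = base + offset
    return address if buffer_size is None else address % buffer_size


def addresses_for_operation(offset: int, n_sub_objects: int,
                            stride: int, buffer_size: int) -> List[int]:
    """Addresses used by one specific operation over successive sub-objects.

    The offset stays constant for the operation; only the base address
    changes (here by `stride`) each time the sub-object changes.
    """
    return [result_address(k * stride, offset, buffer_size)
            for k in range(n_sub_objects)]
```

Because only the base address changes from one sub-object to the next, the code executed for each sub-object can stay identical.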
- the platform includes means for at least a portion of the results of specific operations to be stored in memory at a predetermined address for all results of the same operation.
- the platform comprises at least one calculation unit provided with a memory, and in which the number of transistors is less than 10,000 per calculation unit, including the associated memory unit.
- the platform comprises means for receiving, preferably directly, specific formatted data computed from generic formatted data, these generic formatted data comprising first data describing at least one generic operation sequence, the calculation specific formatted data being performed taking into account a basic information pathway mode in the platform and specific operations derived from the generic operations, these specific operations forming a sequence of specific operations, and the platform comprising means for performing this sequence of specific operations on an object.
- the platform comprises means for processing an object consisting of an image, the elementary information being pixels of this image.
- the platform is part of an image capture and / or rendering apparatus, the operations implementing parameters whose values depend on the sequence of operations and / or that platform and / or the object to be processed, these parameter values being related to the characteristics of the optics and / or the sensor and / or the imager and / or the electronics and / or the software of the capture apparatus and / or image restitution.
- the platform comprises means for processing an object consisting of a digitized sound signal, the elementary information being the sound samples of this signal.
- the platform comprises means for processing an object consisting of a digital mesh
- the elementary information is the spatial and temporal information characterizing each point of the mesh.
- the invention also relates to an object treated by a treatment method according to the method described above.
- FIGS. 1a, 1b, 1c, 1d, 1e and 1f, already described, represent examples of decomposition of an image into sub-objects, in accordance with the invention
- FIG. 2 represents a device using a method according to the invention
- FIG. 3 represents an example of a sequence of generic operations applied to several logical blocks and a parameter
- FIG. 4 represents the structure of specific formatted data provided to a platform, in a method according to the invention
- FIG. 5 represents the application of an operation specific to an object
- FIGS. 6, 7 and 8 show different architectures of platforms that can process objects according to a method according to the invention
- FIGS. 9a, 9b and 9c show examples of chaining processors in a platform according to the invention.
- the device shown in FIG. 2 is used to process an image 22, this image being a set of pixels represented by at least one numerical value.
- generic formatted data 12 is provided to digital data processing means 10.
- This processing means is for example a compiler.
- the generic formatted data provided by a method according to the invention, includes first and second data that describe generic operation sequences and that provide the relative positions of the logical blocks involved in these generic operations. These first and second data will be illustrated with Figure 3.
- the processing means 10 also receives, as input, a mode of travel 24 of the elementary information in the platform determined according to the characteristics of a processing platform 20, such as a camera for capturing or restoring images.
- the processing means 10 provides the processing platform 20 with specific formatted data 18.
- the specific formatted data contains different types of data, such as data concerning the organization of the pixels in the platform memory, the order in which the pixels are processed by the platform or the grouping of the operations performed by the platform.
- FIG. 3 shows an example of generic formatted data in the form of a sequence of generic operations applied to a logical block B1. This sequence comprises three generic operations. The columns of the table represent, in order: the rank of the operation in the sequence; the name of the generic operation; the logical block (output) to which the result of the generic operation is written; the first input (input 1) of the generic operation, which may be a logical block or a parameter; the relative position of the logical block to be used with respect to the logical block of input 1, if any; the second input (input 2) of the generic operation, which may also be a logical block or a parameter; and the relative position of the logical block to be used with respect to the logical block of input 2, if any.
- the information in the "relative position" columns is the information present in the second data provided to a processing means by a method according to the invention.
- the second data relates to the relative position, according to at least one dimension of the object, in particular spatial and / or temporal, of the blocks and / or parameters with respect to each other, and / or relating to the relative scale, according to at least one dimension of the particular spatial and / or temporal object, logical blocks and / or parameters with respect to each other.
- this information is given here in the form "left" and "right" for readability, but in generic formatted data it can also be encoded by numeric values such as (0; 1) and / or by functions such as
- a generic operation makes it possible to obtain a logical block consisting of the absolute position according to one dimension of the object
- another generic operation, known as indirection, makes it possible to obtain a block by displacement and / or scaling, indicated by a second block, from a third block.
- Table 4 is only an example of coding, the first data and second data can be encoded in various ways in tabular form, but also in symbolic form, in graphic form or in any other form.
- the first logic block used in this sequence of operations is a logic block B1 (51).
- the first generic operation is an addition (52) between the left-shifted logical block B1 (51g) and the right-shifted logical block B1 (51d).
- the second operation (54) is a transformation of the block B2 (53) with respect to a table. This operation therefore has the B2 block (53) and a Param1 parameter (55) which represents the modification table.
- the third and last operation (57) of this sequence is a multiplication of logical blocks.
- the logic block B4 (58) is thus the block obtained at the end of the sequence of generic operations.
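The three-operation sequence described above can be sketched on a single row of pixels. The inputs of the final multiplication are not stated in the text; taking them to be B3 and the unshifted B1 is an assumption, as are the one-dimensional layout and the name `run_sequence`.

```python
from typing import List


def run_sequence(b1: List[int], table: List[int]) -> List[int]:
    """B2 = B1(left) + B1(right); B3 = table[B2]; B4 = B3 * B1 (assumed inputs)."""
    b4 = []
    # Edge pixels have no left or right neighbour, so the output is shorter.
    for x in range(1, len(b1) - 1):
        b2 = b1[x - 1] + b1[x + 1]   # addition of the shifted blocks
        b3 = table[b2]               # application of the correspondence table
        b4.append(b3 * b1[x])        # multiplication of logical blocks
    return b4
```

Nothing in this description fixes a memory layout, a traversal order or a sub-object size, which is what leaves room for several platform-specific translations.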
- the generic formatted data in the example in Table 4 are independent of the platform, the decomposition of the object into subobjects, the way in which the elementary information of the object is browsed, and the order in which the basic information will be processed in the platform, as well as the organization in memory. Indeed, the generic formatted data of table 1 can be translated in various ways into specific formatted data or into code for the platform, for example, without the list being limiting, according to the following translations.
- a first example of a translation that is not optimal in terms of memory and computation time makes it possible to illustrate a simple translation without going through a decomposition into sub-objects:
- a second example of translation shows that the size of the memory used can be decreased without changing the generic formatted data. Indeed, in the first example, 4 physical blocks of size close to the image are used. Only 2 physical blocks can be used using the same memory for BP2, BP3 and BP4. We obtain the following translation:
- a third example of translation shows that the calculation time can be reduced without changing the generic formatted data.
- two physical blocks of size close to the image are used, but the physical block BP2 is written 3 times entirely, the physical block BP1 is read 2 times entirely, and the physical block BP2 is read 2 times entirely.
- a fourth example more particularly adapted to a scalar processor with cache, the result is written in the same memory area as the input. This makes it possible to further reduce the size of the memory and to make memory access local, which is very favorable in the case of a cache memory or a paged memory. In this case a sub-object consists of one pixel. We thus obtain the following translation:
- a fifth example of translation is particularly suitable for a signal processing processor with a small fast memory and a large slow memory: each sub-object is a rectangle, for example 32x32 or any other value that maximizes the use of the fast memory, the rectangles being adjoining. We thus obtain the following translation:
- the sub-objects are traveled from left to right and then from top to bottom
- each sub-object is a rectangle for example 64 horizontal pixels or any other value equal to the size of a vector that the platform knows how to process and store. This translation does not require any memory because a vector is processed at a time. We thus obtain the following translation:
- for each line: create a vector V0 containing on the right the 2 left pixels of the line; extract from V0 and V1 the vector V2 corresponding to the two right pixels of V0 and the left pixels of V1, excluding the 2 right pixels of V0; add V1 and V2 to obtain V2; apply the table to each pixel of V2 to obtain V2; extract from V0 and V1 the vector V3 corresponding to the right pixel of V0 and the left pixels of V1, excluding the right pixel of V0; copy V1 into V0 for the next iteration; multiply V2 by V3 to obtain V2; store the result V2 in the current output physical block.
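The vector translation above can be sketched as follows, under stated assumptions: V1 is taken to be the next `w` pixels of the line, V2 is read as the two right pixels of V0 followed by V1 without its last two pixels, and V3 as the right pixel of V0 followed by V1 without its last pixel. The name `process_line` and the requirement that the line length minus 2 be a multiple of `w` are also assumptions.

```python
from typing import List


def process_line(line: List[int], table: List[int], w: int) -> List[int]:
    """Process one line vector by vector, keeping only V0 to V3 as working storage."""
    if w < 2 or (len(line) - 2) % w:
        raise ValueError("sketch requires w >= 2 and len(line) - 2 divisible by w")
    v0 = [0] * (w - 2) + line[:2]        # the 2 left pixels, placed on the right of V0
    out: List[int] = []
    for start in range(2, len(line), w):
        v1 = line[start:start + w]       # current vector (assumed load step)
        v2 = v0[-2:] + v1[:-2]           # line shifted by two pixels
        v2 = [a + b for a, b in zip(v1, v2)]
        v2 = [table[s] for s in v2]      # apply the table to each pixel
        v3 = v0[-1:] + v1[:-1]           # line shifted by one pixel
        v0 = v1                          # keep context for the next iteration
        v2 = [a * b for a, b in zip(v2, v3)]
        out.extend(v2)                   # store in the output physical block
    return out
```

Under these assumptions each output pixel equals the table applied to the sum of its left and right neighbours, multiplied by the pixel itself, and the output is two pixels shorter than the input line.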
- the third, fourth, fifth and sixth examples above correspond to embodiments according to the invention for various platforms with different architectures in particular in terms of memory and parallelism.
- the invention makes it possible to: reduce the size of the code by using only one loop, and / or
- the examples produce a smaller image than the input image. You can easily, if necessary, get an output image of the same size as the input image by adding code at the beginning and end of each line to duplicate the edge pixel.
- FIG. 4 represents the structure of the formatted data specific to the output of a processing means 10, these data being intended to be supplied to a processing platform 20, according to a method according to the invention.
- the specific formatted data is computed by processing means from generic formatted data provided to the processing means and a mode of browsing of the elementary information in the platform determined by this processing means.
- the generic formatted data includes first data 36 containing data 38 describing at least one generic operation or sequence of operations to be performed by the processing means.
- the generic formatted data also includes second data 40 relating to the relative position and scale of logical blocks with respect to one another, for generic operations involving at least two logical blocks. From this generic formatted data and the browse mode 34, the processing means provides data 42 relating to the specific operations, and data 44 relating to the loops. These data 42 and 44 are part of the specific formatted data 30.
- FIG. 5 illustrates the application of a specific operation to an object.
- the object 55 is divided into four sub-objects 250, 251, 252 and 253. Each of these sub-objects is composed of six elementary information items.
- the operation 262 is applied six times to each sub-object (262a, 262b, 262c, 262d, 262e and 262f) so as to produce six results (264).
- the operation 262 involves elementary information of another sub-object.
- the application 262a involves the elementary information 254 and 256
- the application 262b involves the elementary information 256 and 258
- the application 262c involves the elementary information 258 and 260, elementary information 260 belonging to sub-object 252.
- the application 262d involves the elementary information 154 and 156
- the application 262e involves the elementary information 156 and 158
- the application 262f involves the elementary information 158 and 160, the elementary information 160 belonging to the sub-object 252.
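The way an operation applied N times to a sub-object can involve elementary information of another sub-object is sketched below. The sketch is simplified to one dimension; duplicating the edge value when there is no following sub-object, and the name `apply_with_neighbour`, are assumptions.

```python
from typing import Callable, List


def apply_with_neighbour(sub_objects: List[List[int]], index: int,
                         op: Callable[[int, int], int]) -> List[int]:
    """Apply op to each element and its right neighbour, producing N results."""
    current = sub_objects[index]
    if index + 1 < len(sub_objects):
        borrowed = sub_objects[index + 1][0]   # information from another sub-object
    else:
        borrowed = current[-1]                 # assumption: duplicate the edge value
    extended = current + [borrowed]
    return [op(extended[i], extended[i + 1]) for i in range(len(current))]
```

Only the last of the N applications reaches outside the current sub-object, which is the situation shown for the applications that use an elementary information belonging to sub-object 252.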
- the processing platform comprises five processors chained in one dimension, that is to say that the result of the calculations coming out of the processor ProcA is used at the input of the processor ProcB, and so on.
- the basic information coming out of the ProcE processor is applied to the input of the ProcA processor.
- Each of the processors is provided with a memory unit of limited capacity, denoted MemA to MemE. This memory unit is intended to store the values of parameters that are useful for the specific operations performed by the processor, or elementary information or results of operations that are intended to be reused quickly by the processor.
- the processing consists in applying to the elementary information composing the object a sequence of eight operations denoted OP1 to OP8.
- In order to process the object, it must be broken down into sub-objects of N elementary information items each.
- OP9 and OP10 are created by processing means located upstream of the platform, so that the number of specific operations to be performed on each sub-object is a multiple of the number of available processors.
- each operation is assigned to a processor.
- the processor A performs OP1 and OP6, the processor B performs OP2 and OP7, the processor C performs OP3 and OP8,
- the processor D performs OP4 and OP9, and
- the processor E performs OP5 and OP10.
- Each processor executes a set of instructions (InsA to InsE) corresponding to the specific operations that have been assigned to it. This assignment also depends on the parameters stored in the limited capacity memories. For example, if OP1 is a multiplication by 2, the memory MemA will contain the number 2.
- Each line represents one of the 10 specific operations OP1 to OP10.
- Each column represents one of the elementary information IE1 to IE5 composing each of the sub-objects to be processed.
- This notation IE1 to IE5 is formal; it does not necessarily correspond to a spatial or temporal reality. Indeed, certain specific operations have the effect of moving elementary information.
- the information IE1 processed by the specific operation OP2 may not be the result of the specific operation OP1 applied to the information IE1, but the result of this specific operation OP1 applied to the information IE2, for example if the specific operation OP1 consists of a shift to the left.
- Each box in this table contains the name of the processor that performs the specific operation, as well as the time when that specific operation is performed during processing.
- this table represents only part of the processing. It is assumed here that all the required results of specific operations have been calculated beforehand in the processing. Thus, it can be seen that at the instant T1, the processor ProcA performs the operation OP1 on the first information IE1 of the sub-object 1.
- the other processors are carrying out other operations not shown on this table.
- each of the processors performs an operation on one of the information of the sub-object 1.
- the processor ProcA performs, from T6, the operation OP6.
- This sequencing is obtained by the one-dimensional circular chaining of the processors.
- Basic information can therefore transit from one computing unit to another.
- the elementary information IE1 passes through all the processors to "undergo" the specific operations OP1 to OP5, then it goes back to the processor ProcA to restart a cycle and "undergo" the operations OP6 to OP7.
- the elementary information initially denoted IE1 will not necessarily be the information IE1 at all stages.
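The sequencing obtained with the one-dimensional circular chaining can be sketched as a small schedule. The timing rule used here (operation OPk is applied to information IEj at instant k + j - 1) is an assumption inferred from the two instants quoted above, T1 for OP1 on IE1 and T6 for the start of OP6 on ProcA; the name `schedule` is hypothetical.

```python
from typing import Dict, Tuple

PROCESSORS = ["ProcA", "ProcB", "ProcC", "ProcD", "ProcE"]


def schedule(num_ops: int = 10, num_items: int = 5) -> Dict[Tuple[int, int], Tuple[str, int]]:
    """Map (operation rank, information rank) to (processor, instant)."""
    table = {}
    for op in range(1, num_ops + 1):
        # Operations are assigned to processors in circular order.
        proc = PROCESSORS[(op - 1) % len(PROCESSORS)]
        for item in range(1, num_items + 1):
            table[(op, item)] = (proc, op + item - 1)   # assumed timing rule
    return table
```

With this rule no processor is ever asked to perform two operations at the same instant, and each elementary information moves to the next processor in the chain at each step.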
- the platform contains five processors connected to a common memory, in the manner of a vector processor ("Single Instruction Multiple Data", or SIMD).
- each processor is individually connected to a small memory that can contain parameters such as a correspondence table T.
- each processor performs all the specific operations.
- all processors receive the same set of INS instructions.
- one of the operations consists of using a table to modify one or more elementary information items.
- each of the processors has access to its own table, all the tables being identical.
- each memory is shared by a group of processors.
- all the processors share the same memory and simultaneously obtain the same parameter; in this case, the application of a correspondence table must be performed by calculation using one or more parameters allowing, for example, to calculate a polynomial.
- the platform comprises a vector processor composed of five processors connected to a common memory, similar to the vector processors found notably in personal computers (PC). They are also all connected to a small memory that can contain parameters, including a correspondence table.
- each processor performs all the specific operations.
- all processors receive the same set of INS instructions with data describing all the specific operations to be performed.
- the operation 0P4 is performed by the processors ProcA to ProcE respectively at times T4 to T8. If it is assumed that the operation 0P5 also uses a table, we will have the same way: the operation 0P5 is performed by the processors ProcA to ProcE respectively at times T9 to T13.
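The instants quoted above can be reproduced with a small timeline sketch in which an operation that reads the shared correspondence table is carried out by one processor after another, while other operations are carried out by all processors at the same instant. That second point, and the name `simd_timeline`, are assumptions made for the illustration.

```python
from typing import Dict, List, Tuple

PROCESSORS = ["ProcA", "ProcB", "ProcC", "ProcD", "ProcE"]


def simd_timeline(ops: List[Tuple[str, bool]]) -> Dict[str, Dict[str, int]]:
    """ops is a list of (name, uses_shared_table); returns the instant per processor."""
    instant = 1
    timeline: Dict[str, Dict[str, int]] = {}
    for name, uses_table in ops:
        if uses_table:
            # Accesses to the shared table are serialized, one processor per instant.
            timeline[name] = {p: instant + i for i, p in enumerate(PROCESSORS)}
            instant += len(PROCESSORS)
        else:
            # Assumed: all processors perform the operation at the same instant.
            timeline[name] = {p: instant for p in PROCESSORS}
            instant += 1
    return timeline
```

With three ordinary operations followed by two table operations, OP4 occupies T4 to T8 and OP5 occupies T9 to T13, as in the description.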
- FIG. 9a shows an exemplary embodiment of a platform, comprising several circular chaining according to one dimension of the sub-object.
- the object is a two-dimensional image
- the sub-object has 4 basic information
- the platform has 4 processors arranged in a grid of 4 * 1 processors corresponding to a rectangle of 4 processors horizontally and 1 processor vertically.
- the processors are called, from left to right: P1, P2, P3, and P4.
- the method also implements in this example 2 queues:
- a horizontal queue FHa is connected at the input to an output of P4 and at the output to an input of the processor P1.
- An output of P1 is connected to an input of P2.
- An output of P2 is connected to an input of P3, and an output of P3 is connected to an input of P4.
- a vertical queue FVa is connected at the input to an output of P1, P2, P3 and P4 and at the output to an input of the processor P1, P2, P3 and P4.
- the sequence of specific operations can implement an arbitrary number of horizontal filters FH while using the 4 processors at 100%. For example, in the case of a specific operation OS2 performing the calculation of a filter consisting of an addition between the result of a specific operation OS1 and the result of the same specific operation
- the result of the operation OS1 of the processor P3 is transferred to the processor P4 to be used by OS2 on P4 in combination with the result of OS1 on P4; the result of the operation OS1 of the processor P2 is transferred to the processor P3 to be used by OS2 on P3 in combination with the result of OS1 on P3; the result of the operation OS1 of the processor P1 is transferred to the processor P2 to be used by OS2 on P2 in combination with the result of
- Another specific operation OS3 of the sequence can implement another horizontal filter; the queue makes it possible to recover the data in the right order.
- the sequence of specific operations can implement an arbitrary number of vertical filters FV while using the 4 processors at 100%.
- the sequence of specific operations can implement an arbitrary number of filters FVH that are non-separable along the 2 horizontal and vertical dimensions while using the 4 processors at 100%; for example, a 3x3 non-separable filter applied to 4 results of a specific operation OS4 can solicit FVa twice and then FHa six times, to obtain the 8 sets of 4 previously calculated results of OS4 to be combined with the result set of OS4 of the current sub-object; these non-separable filters can also be used in combination with vertical and / or horizontal filters, the 2 queues making it possible to recover the data in the right order.
- the sequence of specific operations is such that at least two distinct specific operations of the sequence each produce, at least once during their N applications, a result used for the processing of another sub-object.
- the result used for the processing of another sub-object passes through the queue (s).
- FIG. 9b shows a second example, in which the object is a two-dimensional image, the sub-object comprises 4 elementary information items, and the platform comprises 4 processors arranged according to a grid of 2 * 2 processors, corresponding to a rectangle of 2 processors horizontally and 2 processors vertically.
- the processors are called from left to right: P4 and P5 on the top line and P6 and P7 on the bottom line.
- the method also implements in this example 2 queues:
- a horizontal queue FHb is connected as an input to the output of P3 and P6 and as an output to the input of P1 and P4
- a vertical queue FVb is connected as an input to an output of P4 and P5 and as an output to an input of the processor P6 and
- the sequence of specific operations can implement an arbitrary number of vertical and / or horizontal and / or non-separable filters while using the 4 processors at 100%.
- the platform comprises a single processor P8, connected to a horizontal queue FHc and to a vertical queue FVc. These two queues can be used by the processor to store results of specific operations intended to be reused later.
- the sequence of specific operations can implement an arbitrary number of vertical and / or horizontal and / or non-separable filters while using the processor at 100%.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Image Processing (AREA)
- Hardware Redundancy (AREA)
- Multi Processors (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR0553945A FR2895102B1 (fr) | 2005-12-19 | 2005-12-19 | Procede pour traiter un objet dans une plateforme a processeur(s) et memoire(s) et plateforme utilisant le procede |
PCT/FR2006/051390 WO2007071884A2 (fr) | 2005-12-19 | 2006-12-19 | Procede pour traiter un objet dans une plateforme a processeur(s) et memoire(s) et plateforme utilisant le procede |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1964053A2 true EP1964053A2 (fr) | 2008-09-03 |
Family
ID=37307163
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP06847180A Ceased EP1964053A2 (fr) | 2005-12-19 | 2006-12-19 | Procede pour traiter un objet dans une plateforme a processeur(s) et memoire(s) et plateforme utilisant le procede |
Country Status (7)
Country | Link |
---|---|
US (1) | US8412725B2 (fr) |
EP (1) | EP1964053A2 (fr) |
JP (1) | JP5025658B2 (fr) |
KR (1) | KR101391498B1 (fr) |
CN (1) | CN101375311A (fr) |
FR (1) | FR2895102B1 (fr) |
WO (1) | WO2007071884A2 (fr) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2002317219A1 (en) * | 2001-07-12 | 2003-01-29 | Do Labs | Method and system for modifying a digital image taking into account its noise |
US8515052B2 (en) | 2007-12-17 | 2013-08-20 | Wai Wu | Parallel signal processing system and method |
US8755515B1 (en) | 2008-09-29 | 2014-06-17 | Wai Wu | Parallel signal processing system and method |
NZ603057A (en) | 2010-03-18 | 2014-10-31 | Nuodb Inc | Database management system |
US20150070138A1 (en) * | 2012-07-06 | 2015-03-12 | Alan Haddy | Detection of buried assets using current location and known buffer zones |
US9501363B1 (en) | 2013-03-15 | 2016-11-22 | Nuodb, Inc. | Distributed database management system with node failure detection |
US11176111B2 (en) | 2013-03-15 | 2021-11-16 | Nuodb, Inc. | Distributed database management system with dynamically split B-tree indexes |
US10740323B1 (en) | 2013-03-15 | 2020-08-11 | Nuodb, Inc. | Global uniqueness checking in distributed databases |
US10037348B2 (en) | 2013-04-08 | 2018-07-31 | Nuodb, Inc. | Database management system with database hibernation and bursting |
US10255547B2 (en) * | 2014-12-04 | 2019-04-09 | Nvidia Corporation | Indirectly accessing sample data to perform multi-convolution operations in a parallel processing system |
US10884869B2 (en) | 2015-04-16 | 2021-01-05 | Nuodb, Inc. | Backup and restore in a distributed database utilizing consistent database snapshots |
US10067969B2 (en) | 2015-05-29 | 2018-09-04 | Nuodb, Inc. | Table partitioning within distributed database systems |
US10180954B2 (en) | 2015-05-29 | 2019-01-15 | Nuodb, Inc. | Disconnected operation within distributed database systems |
SG11202001323XA (en) | 2017-08-15 | 2020-03-30 | Nuodb Inc | Index splitting in distributed databases |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4839801A (en) * | 1986-11-03 | 1989-06-13 | Saxpy Computer Corporation | Architecture for block processing computer system |
US4939642A (en) * | 1989-02-01 | 1990-07-03 | The Board Of Trustees Of The Leland Stanford Jr. University | Virtual bit map processor |
DE69421103T2 (de) * | 1993-01-22 | 2000-06-08 | Matsushita Electric Industrial Co., Ltd. | Programmgesteuertes Prozessor |
JP3316901B2 (ja) * | 1993-01-22 | 2002-08-19 | 松下電器産業株式会社 | データ分配回路 |
JP3305406B2 (ja) * | 1993-04-26 | 2002-07-22 | 松下電器産業株式会社 | プログラム制御のプロセッサ |
WO1999052040A1 (fr) * | 1998-04-08 | 1999-10-14 | Stellar Technologies, Ltd. | Architecture pour traitement graphique |
FR2827459B1 (fr) * | 2001-07-12 | 2004-10-29 | Poseidon | Procede et systeme pour fournir a des logiciels de traitement d'image des informations formatees liees aux caracteristiques des appareils de capture d'image et/ou des moyens de restitution d'image |
US7532766B2 (en) * | 2001-07-12 | 2009-05-12 | Do Labs | Method and system for producing formatted data related to geometric distortions |
AU2002317219A1 (en) * | 2001-07-12 | 2003-01-29 | Do Labs | Method and system for modifying a digital image taking into account its noise |
US6980980B1 (en) * | 2002-01-16 | 2005-12-27 | Microsoft Corporation | Summary-detail cube architecture using horizontal partitioning of dimensions |
US7158141B2 (en) * | 2002-01-17 | 2007-01-02 | University Of Washington | Programmable 3D graphics pipeline for multimedia applications |
US20070285429A1 (en) * | 2003-12-22 | 2007-12-13 | Koninklijke Philips Electronic, N.V. | System for Generating a Distributed Image Processing Application |
2005
- 2005-12-19: FR, application FR0553945A, publication FR2895102B1 (fr), not active (Expired - Fee Related)
2006
- 2006-12-19: JP, application JP2008545068A, publication JP5025658B2 (ja), not active (Expired - Fee Related)
- 2006-12-19: WO, application PCT/FR2006/051390, publication WO2007071884A2 (fr), active (Application Filing)
- 2006-12-19: CN, application CNA2006800530232A, publication CN101375311A (zh), active (Pending)
- 2006-12-19: EP, application EP06847180A, publication EP1964053A2 (fr), not active (Ceased)
- 2006-12-19: US, application US12/158,129, publication US8412725B2 (en), active (Active)
2008
- 2008-07-18: KR, application KR1020087017736A, publication KR101391498B1 (ko), active (IP Right Grant)
Non-Patent Citations (1)
Title |
---|
See references of WO2007071884A2 * |
Also Published As
Publication number | Publication date |
---|---|
KR20080080398A (ko) | 2008-09-03 |
CN101375311A (zh) | 2009-02-25 |
WO2007071884A2 (fr) | 2007-06-28 |
US8412725B2 (en) | 2013-04-02 |
JP2009524123A (ja) | 2009-06-25 |
FR2895102B1 (fr) | 2012-12-07 |
WO2007071884A3 (fr) | 2007-08-16 |
US20080320038A1 (en) | 2008-12-25 |
JP5025658B2 (ja) | 2012-09-12 |
KR101391498B1 (ko) | 2014-05-07 |
FR2895102A1 (fr) | 2007-06-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2007071884A2 (fr) | Procede pour traiter un objet dans une plateforme a processeur(s) et memoire(s) et plateforme utilisant le procede | |
WO2007071883A2 (fr) | Procede et systeme de traitement de donnes numeriques | |
CN111194458B (zh) | 用于处理图像的图像信号处理器 | |
EP0248729B1 (fr) | Dispositifs de calcul de transformées cosinus monodimensionnelles, et dispositif de codage et dispositif de décodage d'images comportant de tels dispositifs de calcul | |
TW202134997A (zh) | 用於對影像進行去雜訊的方法、用於擴充影像資料集的方法、以及使用者設備 | |
EP1523730B1 (fr) | Procede et systeme pour calculer une image transformee a partir d'une image numerique | |
Liu et al. | Tape: Task-agnostic prior embedding for image restoration | |
Henz et al. | Deep joint design of color filter arrays and demosaicing | |
EP0206847B1 (fr) | Dispositifs de calcul de transformées cosinus, dispositif de codage et dispositif de décodage d'images comportant de tels dispositifs de calcul | |
EP0369854B1 (fr) | Procédé et circuit de traitement par bloc de signal bidimensionnel d'images animées | |
BE897441A (fr) | Calculateur associatif permettant une multiplication rapide | |
FR3091375A1 (fr) | Instruction de chargement-stockage | |
FR2782592A1 (fr) | Dispositif et procede de compression de donnees d'images recues a partir d'un capteur d'images a configuration bayer, et systeme utilisant ce dispositif | |
FR2588142A1 (fr) | Systeme permettant le traitement a haute vitesse par convolutions de donnees d'image. | |
WO2007071882A2 (fr) | Procede pour fournir des donnees a un moyen de traitement numerique | |
Mazumdar et al. | A hardware-friendly bilateral solver for real-time virtual reality video | |
CN116547694A (zh) | 用于对模糊图像去模糊的方法和系统 | |
Ma et al. | Searching for fast demosaicking algorithms | |
CN113298740A (zh) | 一种图像增强方法、装置、终端设备及存储介质 | |
FR2823050A1 (fr) | Dispositif implementant conjointement un post-traitement et un decodage de donnees | |
EP2901411B1 (fr) | Dispositif de decomposition d'images par transformee en ondelettes | |
US9779470B2 (en) | Multi-line image processing with parallel processing units | |
EP2221727A1 (fr) | Système et procédé de traîtement de données numériques | |
Sambamurthy et al. | Scalable intelligent median filter core with adaptive impulse detector | |
Liang et al. | Pixel-wise exposure control for single-shot HDR imaging: A joint optimization approach |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| | 17P | Request for examination filed | Effective date: 20080606 |
| | AK | Designated contracting states | Kind code of ref document: A2; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR |
| | 17Q | First examination report despatched | Effective date: 20081014 |
| | 111Z | Information provided on other rights and legal means of execution | Free format text: AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR; Effective dates: 20100420, 20090813 |
| | 111Z | Information provided on other rights and legal means of execution | Free format text: AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR; Effective dates: 20090813, 20100420, 20100906 |
| | APBK | Appeal reference recorded | Free format text: ORIGINAL CODE: EPIDOSNREFNE |
| | APBN | Date of receipt of notice of appeal recorded | Free format text: ORIGINAL CODE: EPIDOSNNOA2E |
| | APBR | Date of receipt of statement of grounds of appeal recorded | Free format text: ORIGINAL CODE: EPIDOSNNOA3E |
| | APAF | Appeal reference modified | Free format text: ORIGINAL CODE: EPIDOSCREFNE |
| | 111Z | Information provided on other rights and legal means of execution | Free format text: AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR; Effective date: 20090813 |
| | DAX | Request for extension of the european patent (deleted) | |
| | 111Z | Information provided on other rights and legal means of execution | Free format text: AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR; Effective date: 20090813 |
| | RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: GOPRO TECHNOLOGY FRANCE SAS |
| | RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: GOPRO, INC. |
| | REG | Reference to a national code | Ref country code: DE; Ref legal event code: R003 |
| | APBT | Appeal procedure closed | Free format text: ORIGINAL CODE: EPIDOSNNOA9E |
| | APAM | Information on closure of appeal procedure modified | Free format text: ORIGINAL CODE: EPIDOSCNOA9E |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
| | 18R | Application refused | Effective date: 20171115 |