WO2007071883A2 - Method and system for processing digital data (Procédé et système de traitement de données numériques)
- Publication number
- WO2007071883A2 (application PCT/FR2006/051389)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- specific
- sub
- operations
- platform
- specific operations
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/445—Exploiting fine grain parallelism, i.e. parallelism at instruction level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/456—Parallelism detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/448—Execution paradigms, e.g. implementations of programming paradigms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/448—Execution paradigms, e.g. implementations of programming paradigms
- G06F9/4494—Execution paradigms, e.g. implementations of programming paradigms data driven
Definitions
- the present invention provides a method and system for processing generic formatted data to provide formatted data specific to a processing platform.
- the specific formatted data can be provided directly or indirectly by using a compiler to generate platform-specific binary code from the specific formatted data.
- object processing algorithms are typically written in a known language, such as C, assembler, or VHDL, and are then compiled or translated into binary form before being used by a processing platform.
- the compilers used are each specific to one type of language.
- the C code is translated so that it can be understood by, for example, a PC-type or Mac-type personal computer.
- This compilation depends on the platform for which the algorithm is compiled.
- since the processors of these two platforms have similar characteristics, it is possible to write processing algorithms compatible with both platforms.
- a vector processor, in particular of the SIMD (Single Instruction Multiple Data) type, must process data grouped into vectors whose size differs from one platform to another; since vectors cannot be expressed in a platform-independent way in a language such as C, the code must be written in assembler in a platform-dependent manner.
- SIMD Single Instruction Multiple Data
- the physical blocks are stored in memory, which results in an intensive use of the memory and a lot of time spent writing and reading intermediate data from the memory, and
- the object of the invention is therefore to propose a data processing method that can automatically optimize, for several platforms, with high execution speed, low memory usage and small code size, any algorithm for processing objects composed of a large number of elementary information items, coded in a language adapted to this type of algorithm.
- the invention notably makes it possible to accelerate the placing on the market of hardware and software for processing data, in particular images, by automatically and quickly obtaining an optimized implementation for various platforms of any algorithm, and by allowing a modification of the algorithm as late as possible. In the case of an image capture device, for example, this late modification makes it possible to quickly adapt to a new sensor whose characteristics, notably noise, which increases with miniaturization, evolve very quickly.
- the method comprises the following steps:
- translating the sequence of generic operations into specific operations according to the platform,
- determining the set of loops necessary for the processing according to the topology of the object, and independently of the first data,
- calculating the specific formatted data, which include the sequence of specific operations and the loops thus determined and which make it possible, directly or indirectly, to process the object according to the generic formatted data, the sequence being optimized for the platform in terms of code size and/or memory size and/or computation time.
- the specific formatted data includes the specific sequence of operations thus determined.
- the sequences of generic operations used make it possible to remove the dependence of the loops on the operations and on the operations graph.
- the sequences of generic operations used do not involve arrays or indexes for accessing an array element; in particular, the method or system according to the invention does not include a step of modifying the organization of arrays and/or modifying the operations that use the arrays.
- the invention thus makes it possible to obtain directly, without specific optimization of the loops for each sequence of operations and without the use of a graph, optimal code that uses the Q processors at 100%. This is a particularly difficult feature to obtain.
- the invention also makes it possible to greatly reduce memory usage by processing the elementary information according to a suitable browse mode, one sub-object at a time, whatever the size of the sub-object; in the case of an image, this avoids storing a large number of lines before starting the processing of a block.
- the objects are monochromatic images with two dimensions, horizontal and vertical, the elementary information items each being represented by a single numerical value
- the sequence of generic operations is the following:
  o apply a vertical filter F1 (3*1) followed by a horizontal filter F2 (1*3)
- this sequence of generic operations is translated into the following sequence of specific operations:
  o store in R1 a sub-object obtained from an input queue
  o calculate F1.C1 * R1 and store the result in R2
  o perform UP (R1) and store the result in R1
  o calculate F1.C2 * R1 and store the result in R2
  o perform UP (R1) and store the result in R1
  o calculate F1.
- F1.C1, F1.C2, F1.C3 being parameters corresponding to the coefficients of the filter F1
- F2.C1, F2.C2, F2.C3 being parameters corresponding to the coefficients of the filter F2
- LEFT (R1) can be implemented using a chaining along the horizontal dimension including a queue FileH as described below; for example:
  o if the sub-objects are composed of 4 pixels arranged horizontally and a register contains 4 data items, from left to right R1.1, R1.2, R1.3 and R1.4: LEFT (R1) means writing R1.4 to FileH, writing R1.3 to R1.4, writing R1.2 to R1.3 and writing R1.1 to R1.2; for the first sub-object of a line R1.1 is unchanged, otherwise R1.1 receives a datum read from FileH
- UP (R2) can be implemented using a chaining along the vertical dimension including a queue FileV as described below:
  o if the sub-objects are composed of 4 pixels arranged horizontally and a register contains 4 data items, from left to right R1.1, R1.2, R1.3 and R1.4: UP (R1) means to write R1.4 in
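The LEFT operation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `left`, the use of a `deque` for the horizontal queue, and the `first_in_line` flag are hypothetical and do not come from the patent text; the register is modelled as a 4-element list (index 0 is R1.1, index 3 is R1.4).

```python
from collections import deque

def left(register, file_h, first_in_line):
    """Shift a 4-element register by one position, chaining through file_h.

    Hypothetical sketch: register[0] plays the role of R1.1 and
    register[3] the role of R1.4; file_h is the horizontal queue.
    """
    if first_in_line:
        file_h.clear()                 # the queue is re-initialised at each line start
    file_h.append(register[3])         # write R1.4 to the queue
    shifted = [register[0], register[0], register[1], register[2]]
    if not first_in_line:
        shifted[0] = file_h.popleft()  # R1.1 receives the oldest datum in the queue
    return shifted

queue_h = deque()
first = left([1, 2, 3, 4], queue_h, first_in_line=True)
second = left([5, 6, 7, 8], queue_h, first_in_line=False)
```

With a line holding the pixels 1 to 8 split into two sub-objects, each output position receives the value of its left neighbour, and the left edge is replicated for the first sub-object.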
- the translation of the sequence of generic operations into a sequence of specific operations is independent of the browse mode as defined below. It is then possible to determine a browse mode (24) for the elementary information in the platform according to the architecture of this platform (22) and according to the topology of the object, and independently of the first data, the determination of this browse mode comprising the choice and/or calculation of:
  o a grouping of the elementary information into sub-objects, each comprising a number N of elementary information items determined according to the platform, the processing in the platform consisting of periodically starting a sub-processing, which applies the sequence of specific operations to one of the sub-objects,
  o the shape and overlap of the sub-objects, determined according to the platform,
  o the order of processing of the sub-objects, determined according to the platform.
- the number of iterations depends on the size of the image.
- all the loops required for processing have been determined according to the architecture of the platform and according to the topology of the object, and independently of the first data.
- the loops can be used for any of the following generic operation sequences:
- a loop may be, without the list being limiting: a loop executed a certain number of times, a loop executed as long as a condition holds, a loop executed until a condition holds, or more generally an iterative execution tied to one or more loop exit conditions.
- a queue makes it possible to transmit and / or store elementary information or results of specific operations.
- a queue may include or use a memory.
- a queue can be implemented using one or more FIFO processes ("First in first out").
- a queue has at least one entry and at least one exit.
- a queue may be functionally connected by any means to an input computing unit and an output computing unit.
- a queue may also be functionally connected by any means to PR input calculation units and PR output calculation units, in which case the queue behaves like PR queues, each connecting an input calculation unit to an output calculation unit.
- a queue makes it possible to manage several data streams independently, each stream being associated with a given specific operation.
- a queue uses at least one memory unit making it possible to store, for each stream, an identical number NF of data items.
- NF is determined according to the relative arrangement of the sub-objects and the browse mode, so that NF-1 sub-objects are processed between the processing of a sub-object producing a datum and the processing of the sub-object using that datum.
- a chaining comprising calculation units and a queue includes a mechanism for managing start-up: the queue is initialized regularly, for example at the beginning of each line if the queue is part of a horizontal chaining and the object is an image; as long as the queue does not contain NF data items, the processor that follows the queue in the chaining takes as input the datum that it itself outputs; after that, the processor that follows the queue in the chaining takes the oldest datum in the queue as input and removes it from the queue.
- the queue makes it possible to output the data in the same order as they were entered in the queue.
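The queue behaviour described above (a fixed depth NF, a start-up phase, then first-in first-out delivery) can be sketched as follows. The class name `ChainQueue` and its methods `reset` and `exchange` are hypothetical names introduced for illustration only, and a single data stream is modelled.

```python
from collections import deque

class ChainQueue:
    """Hypothetical single-stream queue of depth nf used inside a chaining."""

    def __init__(self, nf):
        self.nf = nf
        self.fifo = deque()

    def reset(self):
        """Re-initialise the queue, e.g. at the beginning of each image line."""
        self.fifo.clear()

    def exchange(self, outgoing):
        """Store `outgoing` and return the datum the next unit takes as input."""
        if len(self.fifo) < self.nf:
            # Start-up: the queue is not yet full, so the unit reuses its own output.
            self.fifo.append(outgoing)
            return outgoing
        oldest = self.fifo.popleft()   # oldest datum leaves the queue first
        self.fifo.append(outgoing)
        return oldest

q = ChainQueue(nf=2)
received = [q.exchange(x) for x in ["a", "b", "c", "d"]]
```

With NF = 2, the datum produced while processing the first sub-object is consumed while processing the third one, so NF-1 = 1 sub-object is processed in between.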
- the circular chaining is unidirectional.
- the circular chaining is such that there is a single input link and a single output link per calculation unit.
- generic formatted data is digital data for describing a processing to be performed on an object by a data processing platform, regardless of the platform itself.
- the treated objects correspond to a set of elementary information of the same nature; these objects are, for example, images, digitized sounds, video or simulation data.
- the objects and sub-objects, as well as the logical blocks have several dimensions.
- the dimensions of the subobjects and logical blocks correspond to all or part of the dimensions of the object.
- the dimensions can be of various natures, in particular:
  o spatial, for example a distance, an angle or a path through a mesh,
  o temporal,
  o frequency, for example a color, a frequency or a frequency band,
  o a phase,
  o a decomposition on a basis of a vector space, for example a wavelet decomposition,
  o a decomposition into most significant and least significant bits.
- a raw fixed image with 2 dimensions, each corresponding to distances, the pixels being each provided with a color, for example red, green, or else blue,
- an image with dimensions of distance and / or of angle and / or temporal, and / or frequency
- an object with one or more dimensions.
- the elementary information of an object may have a position and / or a scale, in particular absolute space and / or time and / or frequency, but also according to at least one dimension of the object as well as in any other space, especially a space consisting of wavelets:
- an elementary information of an object "sound" can correspond to an intensity; in this case, the elementary information has an absolute position corresponding to a given instant and, in the case of a multichannel sound, for a given channel.
- an elementary information of an "image" object can correspond to a pixel; in this case, the elementary information has an absolute position corresponding to a position in the image and, in the case of a video image, at a given moment.
- An elementary information of a "simulation data" object can correspond to a state; in this case, the elementary information has an absolute position corresponding to a mesh node and at a given moment.
- an elementary information item of a "modulated signal" object may correspond to an intensity and/or a phase; in this case, the elementary information item has an absolute position corresponding to a given instant and, possibly, to a given frequency and/or to a given position if several antennas or transmitters are used.
- Elementary information is a piece of information to be processed, represented by one or more numerical values. This information can be encoded according to various types of coding such as 8-bit coding, 10-bit coding or signed 16-bit coding. In the case where the object is an image, for example, the elementary information items are the pixels of this image.
- the objects can be raw images ("raw" type) before demosaicing, in which case:
- an elementary information item is a pixel represented by a numerical value corresponding, depending on the absolute position of the pixel, to red, green or blue, for example; in another variant, an elementary information item is a group of pixels (for example a group of 2*2 pixels, green, red, blue, green, corresponding to a "Bayer" pattern) represented by one numerical value per pixel.
- the objects can also be visible images, in which case an elementary information is a pixel represented, for example, by three numerical values, each representing a color, for example red, green and blue.
- the objects can also be sequences of images, in particular raw or visible, in which case an elementary information is a pixel of an image of the image sequence.
- the objects therefore correspond, for example, to videos.
- the image may be derived from an image capture apparatus and / or intended for an image reproduction apparatus:
- An image capture device is, for example, a disposable camera, a digital camera, a reflex camera (digital or not), a scanner, a fax machine, an endoscope, a video camera, a camcorder, a surveillance camera, a toy, a video or still camera integrated in or connected to a telephone, a personal assistant or a computer, a thermal camera, an ultrasound machine, an MRI (magnetic resonance imaging) device, or an X-ray machine.
- An image rendering apparatus is, for example, a screen, a projector, a television set, virtual reality glasses, or a printer.
- a device for capturing and rendering images is, for example, a scanner/fax/printer, a photo-printing minilab, or a videoconferencing device.
- the processing platform can take various forms depending on the application. By way of example, in the case where the object is an image, mention will be made in particular of the case where the processing platform is integrated with one of the following devices:
- An image capture apparatus that produces processed images, for example a digital camera that integrates a processing platform.
- An image rendering apparatus which displays or prints processed images, for example a video projector or a printer including a processing platform.
- a mixed device that corrects the defects of its elements for example a scanner / printer / fax including a processing platform.
- a professional image capture apparatus that produces processed images, for example an endoscope including a processing platform.
- the processing platform may be moved, in whole or in part, to a remote server.
- the processing that will be applied to the object in the platform corresponds to an algorithm, described by one or more sequences of generic operations, which can be used in various fields such as, for example, image processing, data compression and decompression, sound processing, signal modulation and demodulation, measurement, data analysis, database indexing or searching, computer vision, graphics processing, simulation, or any field that involves a large amount of data.
- Generic operations are operations that apply to logical blocks, that is abstract entities, without any notion of size, shape, or moment.
- Generic operations can produce logical blocks.
- at least one logic block corresponds to the object to be processed.
- the method further comprises the step of determining a browse mode for the elementary information in the platform according to the architecture of this platform and according to the topology of the object, and independently of the first data, the determination of this browse mode including the choice and/or calculation of:
- the browse mode is determined according to the architecture of the platform and according to the topology of the object, and independently of the first data.
- the browse mode is compatible with any one of the following generic operation sequences:
- the processing of each sub-object is distributed over the Q processors, which each perform at least one specific operation IS8 of the specific sequence of operations.
- the processors are used for each sub-object and the same processor is used for all the sub-objects. It is therefore not necessary to assign the sub-objects to the processors.
- the same specific operation is also performed by the same processor for the processing of all other sub-objects.
- the processing is thus regular: the specific operations are assigned to the processors and then performed periodically for the sub-processing of each sub-object.
- all the loops required for processing depend on the topology of the object and the platform, but are independent of the sequence of specific operations.
- the loops are nested within each other around the complete specific operation sequence.
- the loops encapsulate the whole of the specific sequence of operations; the specific sequence of operations is not split into sub-sequences each surrounded by its own loops.
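The loop structure described above, in which the loops depend only on the object and the platform and wrap the complete sequence of specific operations, can be sketched as follows. The function name `process_object`, the row-by-row layout and the representation of operations as Python callables are assumptions made for illustration.

```python
def process_object(rows, n, sequence):
    """Apply the whole specific-operation sequence to each sub-object in turn.

    Hypothetical sketch: `rows` is an image as a list of lines, `n` is the
    number of elementary information items per sub-object, and `sequence`
    is a list of callables standing in for the specific operations.
    """
    result = []
    for row in rows:                           # loop over lines (object topology)
        out_row = []
        for start in range(0, len(row), n):    # loop over sub-objects of n items
            sub_object = row[start:start + n]
            for operation in sequence:         # complete sequence, never split
                sub_object = operation(sub_object)
            out_row.extend(sub_object)
        result.append(out_row)
    return result

increment = lambda sub: [v + 1 for v in sub]
double = lambda sub: [v * 2 for v in sub]
processed = process_object([[1, 2, 3, 4], [5, 6, 7, 8]], 2, [increment, double])
```

Changing the sequence of operations leaves the two outer loops untouched, which is the independence property the text describes.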
- the sub-objects are composed of contiguous elementary information. It is thus possible to implement processor chaining comprising at least one queue.
- each of the specific operations of the sequence is performed N times in total and N / Q times by each of the Q processors.
- each processor performs the part of the sequence taking into account these conditional branches.
- the sub-objects have no overlap in at least one dimension, so that at least one specific operation result produced when processing a sub-object is used when processing another sub-object.
- the sub-objects have no overlap in any dimension, so that each processor can be used at 100% without repeating any calculation.
- the sequence of specific operations is such that at least one specific operation k of the sequence produces, at least once during its N applications, a result used for the processing of another sub-object.
- the queue is shared between all the circular chainings of the same dimension.
- there is exactly one queue per dimension of the sub-object and each queue is shared between all the circular chainings of the same dimension, so that the communication between the processors and the organization of the memory are particularly simple.
- periodically starting a sub-processing does not necessarily mean starting at exactly regular intervals, since the synchronization of the data needed for the calculations and the memory accesses can make the period vary.
- N is not multiple of Q.
- Q is equal to the number of specific operations of the sequence obtained by translating the generic sequence of operations.
- N is a multiple of Q. This makes the processing regular.
- N = Q. This makes it possible to reduce the amount of memory required for storing temporary results.
- Q > 1 and N = Q. This makes it possible to use the Q calculation units of a vector processor at 100%.
- Q > 1 and N is a multiple of Q. This makes it possible to use the Q calculation units of a vector processor at 100%, while reducing the number of results of specific operations produced during the processing of one sub-object and used for the processing of another sub-object.
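For the variants in which N is a multiple of Q, the statement that each specific operation is performed N times in total and N/Q times by each processor can be illustrated with a small sketch. The function name `assign_applications` and the contiguous assignment of applications to processors are hypothetical choices; the text does not specify which application goes to which processor.

```python
def assign_applications(n, q):
    """Split the n applications of one specific operation over q processors.

    Hypothetical sketch covering only the case where n is a multiple of q:
    every processor then performs exactly n // q applications.
    """
    if q <= 0 or n % q != 0:
        raise ValueError("this sketch assumes q > 0 and n a multiple of q")
    per_processor = n // q
    return [
        list(range(p * per_processor, (p + 1) * per_processor))
        for p in range(q)
    ]

shares = assign_applications(8, 4)
```

Every processor receives the same number of applications, so none is idle during a sub-processing.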
- the memory location in which the data are stored depends, for example, on when it will be reused.
- the data can be stored in registers, in a fast memory, or in a slow memory.
- determining the browse mode includes the step of determining which specific operation(s) of the specific operation sequence is (are) performed by each processor, each of the operations being applied at least N times to each sub-object, each specific operation involving, directly and/or indirectly, at least once each of the elementary information items of a sub-object during its N applications and producing exactly N results.
- the method comprises the step of adding to the specific formatted data the information thus determined.
- each processor performs all operations of the sequence of specific operations. As the sub-processes are started periodically, preferably the specific operations performed by a given processor are the same for each period.
- determining the browse mode includes the step of determining, at each relative time of the period, which specific operation(s) of the specific operation sequence is (are) performed by each processor, each of the operations being applied at least N times to each sub-object, each specific operation involving, directly and/or indirectly, at least once each of the elementary information items of a sub-object during its N applications and producing exactly N results.
- the method comprises the step of adding to the specific formatted data the information thus determined.
- all the processors perform the same specific operation at the same time.
- all the processors perform the same specific operation successively, which makes it possible to perform recursive filters.
- the sequence of specific operations is such that at least two distinct specific operations of the sequence each produce, at least once during their N applications, a result used for the processing of another sub-object.
- the method further includes the step of computing the specific formatted data according to the specific operations and the determined browse mode.
- the logic blocks are of several types according to the generic operation, for example in the case where the object is an image, at least one logical block is of "raw" type, and / or
- At least one logical block is of red type, and / or
- At least one logical block is of green type, and / or
- At least one logic block is of blue type, and / or
- At least one logic block can be represented by 8-bit data, and / or at least one logic block can be represented by 16-bit data.
- At least one logic block can be represented by data on n bits, for example 10 or 12.
- At least one logic block contains multi-scale data, for example data at scales 1, 1/2, 1/4 and 1/8. This makes it possible to carry out generic operations at several scales, and then to combine the results.
- An algorithm can, for example, without the list being limiting, correspond in the case where the object is an image with:
- processing intended in particular to improve color rendering, and/or processing intended in particular to improve contrast rendering, and/or
- processing intended in particular to improve the rendering of details, and/or
- the object to be processed is generally, during the processing, broken down into sub-objects, that is to say into groups of elementary information having a determined size and shape, both in space and in time, or other dimensions of the object.
- Sub-objects are sets of elementary information whose shape and size depend, as the case may be, on the characteristics of the platform, in particular the size and type of memory and, in the case of a vector processor, the size of a vector, but also on the characteristics of the object to be processed.
- Different types of possible decomposition into sub-objects without overlap are illustrated in Figures 1a to 1d.
- the same image can be cut into lines (lines 90, 91, 92 and 93 in Figure 1a), into columns (columns 94, 95, 96 and 97 in Figure 1b), into sub-objects of a completely different shape (shapes 70, 71, 72 and 73 in Figure 1c), or into rectangles (shapes 60, 61, 62, 63, 64, 65, 66 and 67 in Figure 1d).
- when the sub-objects have no overlap, it is necessary to access elementary information of at least one other sub-object in order to process the elementary information of a sub-object without losing an edge, for example when calculating filters.
- the decomposition into sub-objects may also depend on the second data, in particular on the cumulative relative displacement, in order, for example, to determine the overlap required along one dimension.
- the image is subdivided into sub-objects having a non-zero overlap along at least one dimension.
- This configuration is illustrated in FIGS. 1a and 1e: FIG. 1c represents a sub-object composed of 6×6 elementary information items in the case where the sequence of operations loses one pixel on each edge, and FIG represents an object comprising 100 elementary information items.
- the sub-objects are four rectangles 80, 82, 83 and 84, each containing 36 elementary information items.
- the rectangle 80 consists of the 36 elementary information items located at the top left of the image, and the rectangle 82 consists of the 36 elementary information items at the top right of the image.
- the 8 elementary information items 86 are common to both sub-objects 80 and 82.
- the 8 elementary information items 85 are common to both sub-objects 80 and 83; the 8 elementary information items 88 are common to the two sub-objects 82 and 84; and the 8 elementary information items 89 are common to the two sub-objects 83 and 84. Finally, the 4 elementary information items 87 are common to the four sub-objects 80, 82, 83 and 84.
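The counts given above (four 6×6 sub-objects over a 100-item object, 8 items shared by each pair of neighbours, 4 items shared by all four) can be checked with a short sketch. It assumes a square 10×10 object and sub-objects overlapping by twice the number of pixels lost per edge; the function names and the tile indices 0 to 3 (standing in for the reference numerals) are hypothetical.

```python
from collections import Counter

def tile_origins(size, tile, lost_per_edge):
    """Origins of tiles along one dimension; neighbours overlap by 2 * lost_per_edge."""
    step = tile - 2 * lost_per_edge
    return list(range(0, size - tile + 1, step))

def sharing_counts(size, tile, lost_per_edge):
    """Count, for each set of tiles, how many items belong to exactly that set."""
    origins = tile_origins(size, tile, lost_per_edge)
    tiles = [(r, c) for r in origins for c in origins]
    counts = Counter()
    for y in range(size):
        for x in range(size):
            owners = frozenset(
                i for i, (r, c) in enumerate(tiles)
                if r <= y < r + tile and c <= x < c + tile
            )
            counts[owners] += 1
    return tiles, counts

tiles, counts = sharing_counts(size=10, tile=6, lost_per_edge=1)
shared_by_top_pair = counts[frozenset({0, 1})]
shared_by_all_four = counts[frozenset({0, 1, 2, 3})]
```

Under these assumptions the top two tiles share 8 items exclusively and all four tiles share 4 items, which matches the figures quoted in the text.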
- the object is an image
- the image is decomposed into juxtaposed rectangular sub-objects, the sub-objects being intended to be processed, for example, from left to right and then from top to bottom.
- sub-objects are chosen and stored in one of the following ways, without the list being exhaustive:
- the size of the sub-objects is chosen so that a sub-object can be processed without access to the slow memory; for example, sub-objects corresponding to squares of 32x32 pixels may be taken, the result of the calculation on the preceding sub-object being transferred to the slow memory during the calculation on the current sub-object, while the data needed for the calculation on the following sub-object are transferred from the slow memory to the fast memory,
- the size of the sub-objects is chosen so that a sub-object can be processed using the cache memory as much as possible; for example, sub-objects corresponding to squares of 32x32 pixels, sub-objects of 1 pixel, sub-objects of 4 pixels (2*2) or of N1*2 pixels may be taken, in particular in the case of a raw image,
- in the case of a vector processor, the size of the sub-objects is chosen equal to, or a multiple of, the size of a vector that the platform can process and store; for example, sub-objects corresponding to 64 horizontal pixels may be taken.
- the decomposition into sub-objects can be adapted to the platform in a similar manner.
- specific operations can be assigned to processors.
- a specific operation can be performed by none, one or more processors.
- This choice depends in particular on the architecture of the platform, namely the type of processor and the arrangement of the different processors.
- This architecture also depends on the transit of the data, namely the elementary information and/or results of specific operations, from one processor to another. In this case, if T denotes the time between two successive starts of two sub-processings, then at a time t + k*T, where t is any instant and k any integer, the platform performs on at least one sub-object j the same operations as those performed at time t on at least one sub-object i, these specific operations applying to elementary information and/or results of operations having the same relative position in their respective sub-objects.
- T is the time that elapses between two successive starts of two sub-processings, but this value is not necessarily equal to the time needed to execute a sub-processing completely. Indeed, a sub-processing can be started before the previous one has finished, which can, for example, save time.
- This case is illustrated by FIG. 2, in which it can be seen that the sub-processing ST1 is not completed when the sub-processing ST2 begins. Similarly, the sub-processing ST2 is still executing when the sub-processing ST3 starts.
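The periodic start of sub-processings with a period T shorter than the duration of one sub-processing can be sketched as follows. The function names and the numeric values (period 2, duration 3) are illustrative assumptions and are not taken from the patent or from FIG. 2.

```python
def schedule(num_sub_processings, period, duration):
    """Return (start, end) times when sub-processing k starts at k * period."""
    return [(k * period, k * period + duration) for k in range(num_sub_processings)]

def overlapping_pairs(spans):
    """List consecutive sub-processings where the next starts before the previous ends."""
    return [
        (i, i + 1)
        for i in range(len(spans) - 1)
        if spans[i + 1][0] < spans[i][1]
    ]

spans = schedule(num_sub_processings=3, period=2, duration=3)
pairs = overlapping_pairs(spans)
```

With these values the second sub-processing starts while the first is still running, and the third starts while the second is still running, as in the situation described for ST1, ST2 and ST3.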
- the assignment of specific operations to processors and/or the timing of each specific operation is determined by the platform compiler from the specific formatted data.
- the method comprises the step, in the case where the number of specific operations to be applied to each sub-object is not a multiple of the number of processors Q and/or the number of elementary information items of the object to be processed is not a multiple of N, of adding specific operations without effect and/or null elementary information, so that the number of specific operations is a multiple of Q and the number of elementary information items is a multiple of N.
- null elementary information can be understood to mean elementary information with arbitrary unused content, and/or elementary information obtained by replicating other elementary information, and/or elementary information obtained by calculation.
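The padding step described above can be sketched with a single helper that is applied once to the list of specific operations (multiple Q) and once to the elementary information (multiple N). The helper name `pad_to_multiple` and the filler values `"NOP"` and `0` are hypothetical; the text also allows fillers obtained by replication or by calculation, which this sketch does not model.

```python
def pad_to_multiple(items, multiple, filler):
    """Return a copy of `items` extended with `filler` up to a multiple of `multiple`."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    remainder = len(items) % multiple
    if remainder == 0:
        return list(items)
    return list(items) + [filler] * (multiple - remainder)

# Three specific operations padded for Q = 2 processors.
operations = pad_to_multiple(["op1", "op2", "op3"], 2, "NOP")
# Five elementary information items padded for sub-objects of N = 4 items.
pixels = pad_to_multiple([10, 20, 30, 40, 50], 4, 0)
```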
- the sequence of generic operations applies to at least one set of basic information, called logic block
- the generic formatted data further comprise second data, for generic operations involving at least two logical blocks, relating to the relative position and/or relative scale, in particular spatial or temporal, of the logical blocks with respect to each other; the elementary information and/or results of specific operations to which each specific operation must apply depend on the second data; and, in the case where at least one relative position of the logical blocks with respect to each other is non-zero, at least one specific operation involves, directly or indirectly, at least one elementary information item of another sub-object.
- the position and relative scale between any two logic blocks can be calculated using the second data. We can deduce the size of each corresponding physical block, as well as its scale and the absolute position of each element of the physical block.
- the generic operations comprise at least one generic position operation which makes it possible to obtain a logical block consisting of the absolute position along one dimension of the object, as well as a generic indirect operation, which makes it possible to obtain, from a first block, a second block by displacement and / or scaling according to a third block or a parameter.
- the generic operations comprise at least one elementary generic operation included in the group comprising: the addition of logical blocks and/or parameters, the subtraction of logical blocks and/or parameters, the calculation of the absolute value of the difference between logical blocks, the multiplication of logical blocks and/or parameters, the maximum among at least two logical blocks and/or parameters, the minimum among at least two logical blocks and/or parameters, the grouping and ungrouping of logical blocks, the calculation of a logical block by applying a parameter, corresponding to a correspondence table, to a logical block, the conditional choice of a logical block among at least two logical blocks and/or parameters, this choice being made as follows: if a > b we choose c, otherwise we choose d, where a, b, c and d are logical blocks and/or parameters, the histogram of a logical block, the change of scale of a logical block according to a parameter and/or a logical block, the relative displacement of a logical block according to
- generic operations involving a logical block and a parameter, such as addition, can be translated into processing in the platform; when the generic operation is an addition, for example, this corresponds to adding each element or elementary information of the processed physical block, corresponding to the logical block, to the value of the parameter corresponding to the absolute position of the element or elementary information processed.
- the generic operations include complex generic operations corresponding to groups of generic elementary operations used as such.
- among these groups, the following may be mentioned in particular: the calculation of the median value of at least three logical blocks and/or parameters, which corresponds to a group of generic operations consisting of minimum and maximum calculations; the multiplication/accumulation of logical blocks and/or parameters; the convolution of a logical block with a parameter, which corresponds to a group of generic operations consisting of multiplications and additions with several relative positions; the addition combined with a maximum and a minimum; the calculation of a gradient, which corresponds to an absolute value of differences with two relative positions; the scalar product of a parameter consisting of a vector and several logical blocks to produce a logical block; the calculation of an interpolated change of scale, which corresponds to a group of generic operations consisting of changes of scale, multiplications and additions with several relative positions.
- relative positions and relative scales can correspond to various concepts depending on the nature of the object. They apply between any 2 blocks, whatever their type (in the case of an image as described above, a logical block can be, in particular, raw, red, green, 8 bits).
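As an illustration of a complex generic operation built from elementary ones, the median of three logical blocks can be expressed with minimum and maximum calculations only. The sketch below assumes that a logical block is represented as a plain Python list and that operations are applied element by element; the function names are hypothetical.

```python
def block_min(a, b):
    """Element-wise minimum of two logical blocks of equal length."""
    return [min(x, y) for x, y in zip(a, b)]


def block_max(a, b):
    """Element-wise maximum of two logical blocks of equal length."""
    return [max(x, y) for x, y in zip(a, b)]


def block_median3(a, b, c):
    """Median of three blocks using only min and max:
    median(a, b, c) = max(min(a, b), min(max(a, b), c))."""
    return block_max(block_min(a, b), block_min(block_max(a, b), c))


median = block_median3([1, 5, 9], [2, 4, 7], [3, 6, 8])
```

Four elementary operations (two minima, two maxima) are enough for three inputs, which is why such a group can be treated as a single complex generic operation.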
- the absolute or relative position may correspond in a realization with 2 values (vertical and horizontal) and the absolute or relative scale with 2 values (vertical and horizontal); the pixels of the top line of an object can have as absolute positions (0; 0) (0; 1) (0; 2) ..., and the pixels of the n th line can have as absolute positions
- the relative positions can be coded in the following way: (-1; 0) indicates the pixel above, (0; 1) indicates the pixel to the right, (0; 0) indicates the same place (zero relative position) and (2; -2) indicates 2 pixels below and 2 to the left; a relative scale of (0.5; 0.5) corresponds to half the resolution in each direction. More generally, a combination of relative displacement and relative scale can be encoded using 2 functions f and g as follows: (f(x; y); g(x; y)) for each pixel of absolute position x, y. It should be noted that a rounding rule is necessary in order to take, for example, the nearest pixel. Thus:
- the absolute or relative position and the absolute or relative scale may each correspond to two values (vertical and horizontal); the pixels in the top line of an object can have as absolute positions (0; 0)
- the pixels of the n-th line can have as absolute positions (n; 0.5) (n; 1.5) (n; 2.5) if the line is odd, and (n; 0) (n; 1) (n; 2) if the line is even; the relative position can correspond to 2 values (vertical and horizontal); for example, (-0.5; 0.5) indicates the pixel at the top right.
- a relative scale of (0.5; 0.5) corresponds to half the resolution in each direction.
- a combination of relative displacement and relative scale can be encoded using 2 functions f and g as follows: (f(x; y); g(x; y)) for each pixel of absolute position x, y. It should be noted that a rounding rule is necessary in order to take, for example, the nearest pixel.
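The encoding by two functions f and g, together with a rounding rule, can be sketched as follows. The sketch assumes the (vertical; horizontal) ordering used in the examples above, a "round half up" rule to select the nearest pixel, and clamping at the block edges; the clamping and the function names are assumptions made for illustration only.

```python
import math


def nearest(value):
    """Rounding rule assumed here: round to the nearest integer, halves going up."""
    return math.floor(value + 0.5)


def fetch(block, row, col, f, g):
    """Read the pixel of `block` designated by (f(row, col); g(row, col)).

    f gives the vertical coordinate and g the horizontal one. Coordinates
    are rounded to the nearest pixel and clamped to the block (an assumption).
    """
    rows, cols = len(block), len(block[0])
    src_row = min(max(nearest(f(row, col)), 0), rows - 1)
    src_col = min(max(nearest(g(row, col)), 0), cols - 1)
    return block[src_row][src_col]


block = [[10, 11, 12],
         [20, 21, 22],
         [30, 31, 32]]

# Relative position (-1; 1): one pixel above and one to the right.
above_right = fetch(block, 1, 1, lambda r, c: r - 1, lambda r, c: c + 1)

# A pure change of scale: coordinates multiplied by 0.5 in each direction.
scaled = fetch(block, 2, 1, lambda r, c: r * 0.5, lambda r, c: c * 0.5)
```

The same pattern extends to three functions f, g, h when a temporal coordinate is added.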
- the absolute or relative position may correspond to 3 values (vertical, horizontal and temporal), for example (-
- a combination of relative displacement and relative scale can be encoded using 3 functions f, g, h as follows: (f(x; y; t); g(x; y; t); h(x; y; t)) for each pixel of absolute position x, y at time t. It should be noted that a rounding rule is necessary in order to take, for example, the nearest pixel.
- in the case where the object is a single-channel sound, the absolute or relative position may correspond to 1 value
- the absolute or relative position may correspond to 2 values (time, channel); for example, (-1, 0) indicates the previous instant on the same channel, and (2, 1) indicates 2 instants later on the next channel, the channels being ordered, for example, spatially in a circular manner.
- a combination of relative displacement and relative scale can be encoded using 2 functions f, g as follows: (f (t; c); g (t; c)) for each sound sample position at time t for channel c. It should be noted that a rounding rule is necessary to take, for example, the instant and the nearest channel.
- the absolute or relative position may correspond to n values each corresponding to a spatial or temporal dimension depending on the topology of the mesh.
- a combination of relative displacement and relative scale can be encoded using n functions. It should be noted that a rounding rule is necessary in order to take, for example, the node and the nearest instant.
- the absolute or relative position may correspond to n values respectively corresponding to the time, if any, to the frequency channel (transmission or reception over several frequencies) and if necessary (several transmitters or spatially arranged receivers) to a spatial dimension.
- a combination of relative displacement and relative scale can be encoded using n functions, and a rounding rule should be chosen.
- the absolute or relative position may correspond to n values each corresponding to a dimension of the object which, depending on the case, may be temporal, spatial, frequency, phase or other.
- a combination of relative displacement and relative scale can be coded using n functions and a rounding rule should be chosen.
- the absolute or relative position may correspond to n values each corresponding to a dimension of the object which, depending on the case, may be temporal, spatial, frequency, phase or other.
- a combination of relative displacement and relative scale can be coded using n functions and a rounding rule should be chosen.
- the method includes the step of determining, based on the second data, a portion of the results of specific operations required for subsequent specific operations for another subprocessing.
- the method further comprises the step of grouping in memory the results of specific operations required for subsequent specific operations for another subprocessing based on the second data and / or the browse mode.
- the specific operations are performed by compute units chained according to at least one circular chaining
- the method further comprises the step of determining, according to the second data and for each specific operation, whether or not to transmit the results of said specific operation according to a circular chaining.
- the method further comprises, if necessary, the step of determining, as a function of the second data and the travel mode, which circular chaining to use to transmit the results of said specific operation.
- the relative displacement and/or relative scale information of the second data is used to determine the dimension(s) along which there is a displacement and/or a change of scale for a given generic operation. It is thus possible to determine the circular chaining(s) to be implemented for each specific operation of the sequence of specific operations translated from the sequence of generic operations.
- said circular chaining further comprises at least one queue.
- the specific formatted data includes information about the grouping of specific operations, this grouping consisting of forming packets of one or more specific operations to be executed without storing the results of each specific operation that are not needed for another subprocessing. In one embodiment, all the specific operations of the sequence are grouped together.
- the specific formatted data has only one set of nested loops.
- certain specific operations may apply to results of operations previously used in another subprocessing. For example, suppose the sequence of specific operations contains a filter involving three lines resulting from intermediate calculations, and the image is broken down into sub-objects corresponding to lines: the filter operation applied to the first sub-object, i.e. the first line, also uses, for example, the second and third lines of the image. These second and third lines will also be used by this same filter operation when it is applied to the second sub-object and, in the case of the third line, when it is applied to the third sub-object. In this case, it may be worthwhile to keep these lines of pixels in memory so as not to have to recalculate them later, which is computationally expensive.
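The benefit of keeping intermediate lines in memory can be sketched as follows. The sketch assumes a hypothetical intermediate calculation (doubling each pixel) and a hypothetical vertical filter (sum of three consecutive intermediate lines); a counter shows that each intermediate line is computed only once, although it is used by up to three filter applications.

```python
from collections import deque


def intermediate(line, counter):
    """Hypothetical costly intermediate calculation applied to one line."""
    counter[0] += 1
    return [2 * pixel for pixel in line]


def filter_lines(image):
    """Apply a 3-line vertical filter, keeping the last three intermediate
    lines in memory instead of recomputing them for each output line."""
    counter = [0]
    window = deque(maxlen=3)  # holds the three most recent intermediate lines
    output = []
    for line in image:
        window.append(intermediate(line, counter))
        if len(window) == 3:
            output.append([a + b + c for a, b, c in zip(*window)])
    return output, counter[0]


image = [[1, 2], [3, 4], [5, 6], [7, 8]]
filtered, computed_lines = filter_lines(image)
```

Without the window, the two output lines would require six intermediate line calculations instead of four.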
- the specific formatted data includes operations to store at least a portion of the results of specific operations required for subsequent specific operations for another subprocessing in memory of the platform.
- the method comprises the step of grouping in memory the results of specific operations used in the subprocessing of another sub-object as a function of the relative position of said other sub-object with respect to said subobject.
- the browse mode is such that N is 5, the object is decomposed into 10 * 5 sub-objects, the 10 horizontally arranged sub-objects are processed one after the other, then the sub-objects located below are processed, and so on.
- the queue used according to the horizontal dimension contains data from the previous iteration whereas the queue used vertically contains data from the previous 10 iterations; the grouping in memory of the results of specific operations therefore depends on the mode of travel.
- the grouping in memory of the results of specific operations used during the sub-processing of another sub-object is therefore a function of the relative position of said other sub-object with respect to said sub-object.
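The dependence of the queues on the browse mode can be checked with a short calculation. The sketch assumes the example above, with 10 sub-objects per row processed from left to right and row after row; the helper names are hypothetical.

```python
COLS = 10  # sub-objects per row in the example above


def iteration_index(row, col, cols=COLS):
    """Iteration at which the sub-object (row, col) is processed (row-major order)."""
    return row * cols + col


def iterations_since(row, col, d_row, d_col, cols=COLS):
    """Number of iterations elapsed since the neighbouring sub-object at
    relative position (d_row, d_col) was processed."""
    return iteration_index(row, col, cols) - iteration_index(row + d_row, col + d_col, cols)


horizontal_delay = iterations_since(2, 3, 0, -1)  # left neighbour
vertical_delay = iterations_since(2, 3, -1, 0)    # upper neighbour
```

With a vertical browse mode the two delays would be exchanged, which is why the grouping in memory cannot be fixed independently of the browse mode.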
- the method also includes, in some cases, the step of grouping in at least one queue the results of specific operations performed during the sub-processing of a sub-object and used in the sub-processing of another sub-object.
- the chaining also comprises at least one queue.
- the specific formatted data includes information about the transfer of the results of specific operations and / or basic information in memory from one memory location of the platform to another.
- the fifth example of translation described below contains such transfers.
- at least one specific operation may have an edge effect, that is to say that certain information lying on the edge of the sub-objects to which these operations are applied is lost.
- the image is subdivided into sub-objects that have overlap, that is, they have some common elementary information.
- the specific formatted data includes specific operations such that results of specific operations are calculated several times in the platform, so as not to lose any information when the specific operations are executed, especially in the case where the sub-objects have an overlap in at least one dimension.
- the fifth translation example described below contains such an overlap.
- the specific formatted data contains addressing information to allow the platform to access at least a portion of the results of specific operations and/or elementary information in memory, which addressing information is in the form "base address + offset" or "(base address + offset) modulo (buffer size)", the offset being constant for the results of the same specific operation from one sub-processing to the next.
- the base address is changed for each subprocessing.
- the buffer memory is integrated in one of the memories of the processing platform.
- the buffer memory may in particular be a queue.
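The addressing form "(base address + offset) modulo (buffer size)" can be sketched as follows. The buffer size, the offsets, the operation names and the amount by which the base address advances at each subprocessing are all assumptions chosen for illustration.

```python
BUFFER_SIZE = 8                     # assumed size of the buffer memory
OFFSETS = {"op_a": 0, "op_b": 1}    # hypothetical constant offset per specific operation
STRIDE = 2                          # assumed advance of the base address per subprocessing


def address(base, offset, size=BUFFER_SIZE):
    """Address of the form (base address + offset) modulo (buffer size)."""
    return (base + offset) % size


buffer = [None] * BUFFER_SIZE
for step in range(5):               # five successive subprocessings
    base = step * STRIDE            # only the base address changes between subprocessings
    for name, offset in OFFSETS.items():
        buffer[address(base, offset)] = (name, step)
```

After the fifth subprocessing the addresses wrap around, so the results of the first subprocessing are overwritten while those of the intervening ones remain available. The buffer therefore behaves like a queue whose depth is set by its size.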
- the method further comprises the step of determining said offset based on the first data, the offset being different for each specific operation of the sequence of specific operations obtained by translation of the sequence of generic operations of the first data.
- at least one queue is implemented with addressing information of the form "base address + offset" or "(base address + offset) modulo (the size of a buffer located in the platform)"; the second data is used to determine the queue to use.
- the processing comprises the calculation of at least one loop, the number of iterations of the loop or loops and, when there are several loops, the nesting of the loops, depending on the mode of travel.
- the translation examples below show that it is possible to automatically calculate the loops according to the platform, unlike the known languages where the loops are manually coded according to the platform.
- loops can, for example, be used to browse the sub-objects, especially in the case where the object to be processed is an image divided into rectangular sub-objects, which one chooses to traverse either horizontally or vertically.
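The idea that loops are derived from the browse mode rather than coded by hand can be sketched as follows. The function name and the two mode labels are hypothetical; the sketch only shows that the nesting order of the two loops follows from the chosen mode.

```python
def browse(rows, cols, mode):
    """Return the order in which the sub-objects of a rows x cols grid are
    visited, the nesting of the two loops depending on the browse mode."""
    if mode == "horizontal":        # inner loop runs along a row
        outer, inner = rows, cols
        position = lambda o, i: (o, i)
    elif mode == "vertical":        # inner loop runs along a column
        outer, inner = cols, rows
        position = lambda o, i: (i, o)
    else:
        raise ValueError(f"unknown browse mode: {mode}")
    return [position(o, i) for o in range(outer) for i in range(inner)]


horizontal_order = browse(2, 3, "horizontal")
vertical_order = browse(2, 3, "vertical")
```

The sequence of operations applied to each sub-object is the same in both cases; only the generated loop structure differs.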
- specific formatted data in some cases includes temporary variables required for processing.
- Some specific operations use parameters, and in this case, the values of these parameters are also processed. These parameters may for example be multiplying coefficients. These parameters may correspond, for example, without the list being limited to:
- the use of a parameter dependent on the absolute position of the elementary information in the object to be treated, for a vignetting correction makes it possible to obtain a stronger edge compensation in order to compensate for an optical defect.
- the use of a parameter depending on the absolute position of the elementary information in the object to be processed for demosaicing makes it possible to treat the red pixels, green pixels and blue pixels of a raw image from a sensor differently.
- the use of second data, in particular a displacement, depending on the absolute position of the elementary information in the object to be processed, for a digital zoom calculation or a distortion correction makes it possible to obtain the pixels needed to calculate the interpolation at each point.
- the value of a parameter can: be constant and intrinsic to the algorithm; in this case the parameter value may, in particular, be transmitted to the processing means or to the platform, and / or
- the value of the parameter may depend on the type of optics, which has an impact on the level of blur in the image; in this case the parameter value may, in particular, be transmitted to the processing means or to the platform, and/or
- the value of the parameter may depend on the gain of the sensor actually used to capture said object which has an impact on the noise level in the image; in this case, the parameter value may, in particular, be transmitted, chosen or calculated by the platform, and / or
- the value of the parameter may not depend on the absolute position of the elementary information in the object.
- the parameter value can be determined simultaneously or a posteriori with respect to the definition of the algorithm.
- the value of some parameters can vary from one object to another, from one sub-object to another or from one elementary information to another.
- the value of the parameter is calculated at each change.
- the possible values of the parameter are calculated a priori, and, at each change, the index or the address making it possible to access the value of the parameter, for example in a table, is determined.
- a limited number of sets of parameter values are determined, each set is stored and, for each sub-object, the set to be used is selected, for example by calculating a function of the position giving the address of the set to use.
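Selecting one of a limited number of stored parameter sets from the position of a sub-object can be sketched as follows. The three gain values, the distance measure and the function name are assumptions; they merely mimic a correction that grows stronger towards the edges of the object.

```python
# Hypothetical stored sets of parameter values, from the centre to the edge.
PARAMETER_SETS = [{"gain": 1.0}, {"gain": 1.2}, {"gain": 1.5}]


def set_index(row, col, rows, cols):
    """Function of the position giving the index of the set to use.

    Assumption: the normalised distance of the sub-object from the centre of
    the grid (largest of the vertical and horizontal distances) is quantised
    into as many levels as there are stored sets.
    """
    centre_row, centre_col = (rows - 1) / 2, (cols - 1) / 2
    distance = max(abs(row - centre_row) / max(centre_row, 1),
                   abs(col - centre_col) / max(centre_col, 1))
    return min(int(distance * len(PARAMETER_SETS)), len(PARAMETER_SETS) - 1)


centre_set = PARAMETER_SETS[set_index(2, 2, 5, 5)]
corner_set = PARAMETER_SETS[set_index(0, 0, 5, 5)]
```

Only the index is computed for each sub-object; the parameter values themselves are computed once and stored.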
- the parameters are used when applying certain specific operations.
- the specific formatted data includes the value (s) of the parameters and / or a code for calculating the value (s) of the parameters.
- the parameter value(s) is (are) chosen according to the processing platform, so that the processing takes into account the characteristics of this platform. Thus, an identical algorithm can be used for several platforms, and this algorithm is adapted to each desired platform simply by varying these characteristics.
- these parameter values depend on the object to be processed.
- the specific operations comprise at least one specific calculation operation taken from the group comprising: addition, subtraction, multiplication, application of a correspondence table, minimum, maximum, selection.
- At least one specific calculation operation also produces an offset, and / or a saturation and / or a rounding.
- the specific selection calculation operation makes it possible to choose one of at least two data as a function of the value of a third data item.
- the application of a correspondence table is performed by a calculation that uses the input of the table and a limited number of coefficients. In one embodiment, the limited number of coefficients is set to 8.
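One possible way of replacing a stored table by a calculation with 8 coefficients is a polynomial evaluated by Horner's scheme. The description does not state which calculation is used, so the polynomial form, the coefficient ordering and the function name below are assumptions made for illustration.

```python
NUM_COEFFICIENTS = 8  # the limited number of coefficients of the embodiment


def apply_table(entry, coefficients):
    """Approximate a correspondence table by a polynomial in `entry`.

    coefficients[k] multiplies entry**k (lowest order first); the polynomial
    is evaluated by Horner's scheme, using only multiplications and additions.
    """
    if len(coefficients) != NUM_COEFFICIENTS:
        raise ValueError("exactly 8 coefficients are expected")
    result = 0
    for coefficient in reversed(coefficients):
        result = result * entry + coefficient
    return result


# 1 + 2*x + 3*x**2, the remaining coefficients being zero.
value = apply_table(3, [1, 2, 3, 0, 0, 0, 0, 0])
```

A calculation of this kind needs only the 8 coefficients in memory, whatever the range of the table input.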
- the specific operations are performed by computation units chained by means of at least one circular chaining CCI; said circular chaining CCI further comprising at least one queue; at least one specific operation IS4 of the sequence of specific operations transmitting the result of a specific operation IS5 performed on a calculation unit UC1 to the calculation unit UC2 or queue which follows said calculation unit UC1 according to said chaining.
- the specific operation IS4 transmits, from the queue, to the calculation unit UC0 that follows the queue, the result of a specific operation IS5 performed during a previous subprocessing.
- the queue is preferably used to output the data in the same order as it was entered in the queue.
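A circular chaining that includes a queue can be sketched as follows. The sketch assumes three calculation units in a ring, with a first-in first-out queue placed between the last unit and the first one; the function name and the string results are hypothetical.

```python
from collections import deque


def transmit(results, queue):
    """One transmission step along the circular chaining.

    results[i] is the result held by calculation unit i. Each unit passes its
    result to the next unit; the last unit passes its result to the queue, and
    the first unit receives the oldest entry of the queue, i.e. a result
    produced during a previous subprocessing.
    """
    queue.append(results[-1])      # the last unit feeds the queue
    from_queue = queue.popleft()   # data leaves in the order it entered
    return [from_queue] + results[:-1]


queue = deque(["prev"])            # result left by an earlier subprocessing
first_step = transmit(["a", "b", "c"], queue)
second_step = transmit(["d", "e", "f"], queue)
```

With a queue holding one entry, the first unit receives the result produced by the last unit during the immediately preceding subprocessing; a deeper queue would lengthen this delay accordingly.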
- the sub-object comprises DSO dimensions
- the specific operations are performed by computation units chained to a given dimension DD of the sub-object by means of at least one circular chaining CCI
- said circular chaining CCI further comprises at least one queue
- the method further comprises the step, for at least one specific operation and for each application of said specific operation, of transmitting the result of said application of the specific operation performed on a calculation unit UC1 to the calculation unit UC2 or queue that follows said calculation unit UC1 according to said chaining, and of transmitting from the queue to the calculation unit UC0 following the queue, conditionally according to the position of the sub-object in the object, a result of the application of the specific operation transmitted to the queue when processing another sub-object.
- the specific operations comprise at least one specific geometric operation taken from the group comprising: transposition, replication, downsampling
- the specific operations can be performed by computation units chained according to at least one circular chaining; said circular chaining further comprising at least one queue.
- the specific operations are performed by computation units chained according to at least one circular chaining; said circular chaining further comprising at least one queue; the queue having a size and/or latency; the method further comprising the step of determining the size and/or latency of the queue according to the generic operation sequence and the browse mode.
- the queue comprises several data streams, and makes it possible to store for each stream an identical number NF of data. NF is determined according to the relative disposition of the sub-objects and the browse mode, so that NF-1 sub-objects are processed between the processing of a sub-object producing data and the processing of the sub-object using the data.
- the objects to be processed are images
- the elementary information is the pixels of this image.
- the processing platform is part of an image capture and/or rendering apparatus, and the values of the parameters are related to the characteristics of the optics and/or the sensor and/or the imager and/or the electronics and/or the software of the image capture and/or rendering apparatus.
- the characteristics may be, for example, intrinsic characteristics that are fixed for all objects, or that vary according to the object, for example noise characteristics that vary according to the gain of a sensor.
- the characteristics may be identical for all the elementary information, or variable depending on the absolute position of the elementary information, for example the blur characteristics of the optics.
- the object to be processed is a digitized sound signal, and in this case, the elementary information is the sound samples of this signal.
- the relative positions present in the second data will generally be time positions. However, it may happen that these positions are spatial, especially in the case where the object to be treated is a sound present on several channels.
- the specific formatted data described here can be provided directly to a processing platform. But they can also be provided, in a known computer language such as C or VHDL, to a compiler that translates this computer language for the platform. This makes it possible, for example, to use an existing platform with a compiler, without having to deal with register allocation or instruction scheduling.
- the invention also relates to a system for processing generic formatted data, comprising first data describing a sequence of generic operations that do not include loops, the system providing, directly or indirectly, specific formatted data, for a given platform comprising Q processor (s) and at least one memory, the platform being intended to process, according to the specific formatted data, an object constituted by elementary information of the same nature, each elementary information being represented by at least one digital value, the system comprising:
- the system comprises: means for determining a mode of browsing the basic information in the platform according to the architecture of this platform and according to the topology of the object and independently of the first data, these means for determining this mode of travel including means for choosing and / or calculating:
- a grouping of elementary information into sub-objects each comprising a number N of elementary information, a multiple of Q, determined according to the platform, the processing in the platform consisting of periodically launching a sub-processing, which consists in applying the sequence of specific operations to one of the sub-objects; the shape and overlap of the sub-objects, determined according to the platform; the order of processing of the sub-objects, determined according to the platform; and
- the system includes means for determining which specific operations of the sequence of specific operations are performed by each processor, each of the specific operations being applied at least N times on each sub-object, each specific operation involving, directly and/or indirectly, at least once each of the elementary information of a sub-object during its N applications and producing exactly N results; the method further comprising the step of adding to the specific formatted data the information so determined.
- the system includes means for determining, at each relative time of the period, which specific operations of the sequence of specific operations are performed by each processor, each of the specific operations being applied at least N times on each sub-object, each specific operation involving, directly and / or indirectly, at least once each of the elementary information of a sub-object during its N applications and producing exactly N results; the method further comprising the step of adding to the specific formatted data the information so determined.
- the system includes means for, in the case where the number of specific operations to be applied to each sub-object is not a multiple of the number of processors Q and/or the number of elementary information of the object to be processed is not a multiple of N, adding specific operations without effect and/or null elementary information, so that the number of specific operations is a multiple of Q and the number of elementary information is a multiple of N.
- the system comprises: means for the sequence of generic operations to apply to at least one set of elementary information called a logical block; means for receiving generic formatted data comprising second data, for generic operations involving at least two logical blocks, relating to the relative position and/or the relative scale, in particular spatial or temporal, of the logical blocks with respect to each other; means for the elementary information and/or results of specific operations to which each specific operation is to apply to depend on the second data; and means for, in the case where at least one relative position of the logical blocks with respect to each other is non-zero, at least one specific operation to involve, directly and/or indirectly, at least one elementary information of another sub-object.
- the system includes means for determining, based on the second data, a portion of the results of specific operations required for subsequent specific operations for another subprocessing.
- the system includes a memory in which are grouped the results of specific operations required for subsequent specific operations for another subprocessing, based on the second data and/or the mode of travel.
- the system comprises computation units chained according to at least one circular chaining, and means for determining, as a function of the second data for each specific operation, whether or not to transmit the results of said specific operation in accordance with a circular chaining.
- the system also comprises means for, if necessary, determining, according to the second data and the mode of travel, the circular chaining to be used to transmit the results of said specific operation.
- the system includes means for the specific formatted data to include information about the grouping of specific operations, the grouping consisting of the formation of packets of one or more specific operations to be executed without storing the results of each specific operation that are not needed for another subprocessing.
- the system includes means for the specific formatted data to include operations to store at least a portion of the results of specific operations required for subsequent specific operations for another subprocessing in memory of the platform.
- the system includes a memory in which are grouped the results of specific operations used in the sub-processing of another sub-object, as a function of the relative position of said other sub-object with respect to said sub-object .
- the system comprises at least one queue in which are grouped the results of specific operations performed during the sub-processing of a sub-object and used during the subprocessing of another sub-object.
- the system includes means for the specific formatted data to include specific operations so that specific operation results are computed multiple times in the platform, so as not to lose any information when the specific operations are executed, especially in the case where the sub-objects have an overlap in at least one dimension of the object.
- the system includes means for the specific formatted data to contain addressing information enabling the platform to access at least a portion of the results of specific operations and/or elementary information in memory, such addressing information being in the form "base address + offset" or "(base address + offset) modulo (the size of a buffer located in the platform)", the offset being constant for the results from the same specific operation.
- the system includes means for changing the base address for each subprocessing.
- the system includes means for calculating the offset as a function of the order of the specific operations, so as to provide the platform with addresses of empty memory locations or containing a specific operation result or elementary information which is no longer used, in order to store results of specific operations.
- the system comprises means for calculating at least one loop, the number of iterations of the loop(s) and, when there are several loops, the nesting of the loops, depending on the mode of travel.
- the system includes means for further processing at least one parameter, such that the value(s) of the parameter(s) used by the specific operations depend(s) on the position in the sub-objects of the elementary information involved, directly or indirectly, in these specific operations.
- the system includes means for further processing at least one parameter, such that the specific formatted data includes the value(s) of the parameter(s) and/or a code for calculating the value(s) of the parameter(s).
- the system comprises means for processing, in addition, at least one parameter, and comprising means for selecting the parameter value (s) as a function of the processing platform, so that the treatment takes into account the characteristics of this platform.
- the system comprises means for further processing at least one parameter, such that the value(s) of the parameter(s) depend(s) on the object to be processed.
- the system comprises means for performing a specific operation included in the group comprising: addition, subtraction, multiplication, application of a correspondence table, minimum, maximum, selection.
- the system comprises calculation units chained by means of at least one circular chaining CCI, said circular chaining CCI further comprising at least one queue; the system comprising means for transmitting the result of a specific operation IS5 performed on a calculation unit UC1 to a calculation unit UC2 or queue which follows said calculation unit UC1 according to said chaining.
- the system includes means for performing at least one geometric specific operation included in the group consisting of: transposition, replication, and subsampling.
- the system comprises computation units chained according to at least one circular chaining, the circular chaining also comprising at least one queue.
- the system comprises computation units chained according to at least one circular chaining, the circular chaining further comprising at least one queue, and the system comprising means for determining a size and/or a latency of the queue according to the sequence of generic operations and the traversal mode.
- the system comprises means for the object to be processed to be an image and for the elementary information to be pixels of this image.
- the system comprises means for the processing platform to be part of an image capture and/or rendering apparatus, and for the value(s) of the parameter(s) to be related to the characteristics of the optics and/or the sensor and/or the imager and/or the electronics and/or the software of the image capture and/or rendering apparatus.
- the system comprises means for the object to be processed to be a digitized sound signal and for the elementary information to be sound samples of this signal.
- the system comprises means for the object to be processed to be a digital mesh and for the elementary information to be the spatial and temporal information characterizing each point of the mesh.
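Some of the specific operations listed above (application of a correspondence table, selection, transposition, replication, subsampling) can be illustrated with a minimal sketch. The function names and the exact semantics chosen here, in particular for replication, are assumptions made for illustration and are not taken from the source.

```python
from typing import List, Sequence

Block = List[List[int]]


def apply_table(values: Sequence[int], table: Sequence[int]) -> List[int]:
    """Replace each elementary value by its entry in a correspondence table."""
    return [table[v] for v in values]


def select(cond: Sequence[bool], a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Selection: take the value from `a` where `cond` holds, else from `b`."""
    return [x if c else y for c, x, y in zip(cond, a, b)]


def transpose(block: Block) -> Block:
    """Geometric operation: swap rows and columns."""
    return [list(col) for col in zip(*block)]


def replicate(block: Block, factor: int) -> Block:
    """Geometric operation (assumed meaning): repeat each value `factor`
    times horizontally and vertically."""
    out: Block = []
    for row in block:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def subsample(block: Block, step: int) -> Block:
    """Geometric operation: keep one value out of `step` in each dimension."""
    return [row[::step] for row in block[::step]]
```

Each function works on plain lists, so the same calls apply whether the elementary information is a pixel, a sound sample or a mesh value.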
- FIG. 2, already described, represents an example of sequencing of several sub-treatments
- FIG. 3 represents a device using a method according to the invention
- FIG. 4 represents an example of a sequence of generic operations applied to several
- FIG. 5 represents the structure of specific formatted data provided by a method according to the invention
- FIGS. 6, 7 and 8 show different platform architectures that can receive specific formatted data provided by a method according to the invention.
- Figures 9a, 9b and 9c show examples of chaining processors in a platform that can receive specific formatted data provided by a method according to the invention.
- the device shown in FIG. 3 is used to process an image 22, this image being a set of pixels represented by at least one numerical value.
- generic formatted data 12 is provided to a digital data processing means 10.
- This processing means may for example be a compiler.
- the generic formatted data includes first and second data that describe generic operation sequences and that give the relative positions of the logical blocks involved in these generic operations. These first and second data will be illustrated by the description of FIG.
- the processing means 10 also receives as input a run mode 24 chosen or calculated according to the characteristics of a processing platform 20, such as an image capture or restitution apparatus. From these generic formatted data 12 and these parameters, the processing means 10 provides the processing platform 20 with specific formatted data 18.
- Specific formatted data contains different types of data, such as data about the organization of pixels in the platform's memory, the order in which the pixels are processed by the platform, or the grouping of specific operations performed by the platform.
- the platform 20 then uses these specific formatted data 18 to process the image 22 it receives as input.
- FIG. 3 thus illustrates several advantages of the invention: the generic formatted data 12 can be quickly modified or replaced and translated into specific formatted data optimized for the platform. This reduces the time to market the platform.
- the generic formatted data 12 can be rapidly translated into specific formatted data optimized for several platforms. This also reduces the time to market of multiple platforms.
- Table 4 below and FIG. 4 show an example of generic formatted data, in the form of a sequence of generic operations applied to a logic block B1.
- This sequence has three generic operations.
- the columns of the table represent, in order: the rank of the operation in the sequence; the name of the generic operation; the logical block (output) to which the result of the generic operation is written, that is to say the location where this result would be if the object were reconstituted at the end of each operation; the first input (input 1) of the generic operation, which can be a logical block or a parameter; the relative position of the logical block to be used with respect to the input 1 logical block, if any; the second input (input 2) of the generic operation, which may also be a logical block or a parameter; and the relative position of the logical block to be used with respect to the input 2 logical block, if any.
- the information in the "relative position” columns is the information present in the second data provided to a processing means by a method according to the invention.
- this information is given in the form "left" and "right" to be understandable, but in fact, in generic formatted data, it can also be encoded by numeric values such as (0; 1) and/or by functions such as f(x; y), as described in the above exemplary embodiments.
- a generic operation makes it possible to obtain a logical block consisting of the absolute position according to one dimension of the object
- another generic operation, known as indirection, makes it possible to obtain a block by displacement and/or scaling, indicated by a second block, from a third block.
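The two generic operations just described, the absolute-position block and indirection, can be sketched as follows. This is only an illustration: the function names are hypothetical, the displacement is restricted to the horizontal axis, scaling is omitted, and reads outside the block are clamped to the edge, none of which is specified by the source.

```python
from typing import List

Block = List[List[int]]


def position_block(width: int, height: int, axis: int) -> Block:
    """Block whose value at (x, y) is the absolute coordinate along `axis`
    (0 for the horizontal axis, 1 for the vertical axis)."""
    return [[x if axis == 0 else y for x in range(width)] for y in range(height)]


def indirection(source: Block, dx: Block) -> Block:
    """Build a block by reading `source` at a per-element horizontal
    displacement given by `dx` (assumption: reads are clamped to the edges)."""
    height, width = len(source), len(source[0])
    return [
        [source[y][min(max(x + dx[y][x], 0), width - 1)] for x in range(width)]
        for y in range(height)
    ]
```

Combining the two (for instance, computing a displacement block from a position block) gives position-dependent resampling without any change to the rest of the sequence.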
- Table 4 is only an example of coding, the first data and second data can be encoded in various ways in tabular form, but also in symbolic form, in graphic form or in any other form. In addition, additional information about data types, offsets, and saturations are not shown for simplification of the example.
- the first logic block used in this sequence of operations is a logic block B1 (51).
- the first generic operation is an addition (52) between the logic block B1 shifted left (51g) and the logic block B1 shifted right (51d).
- the second operation (54) is a transformation of the block B2 (53) by means of a table. This operation therefore takes as inputs the block B2 (53) and a parameter Param1 (55), which represents the modification table.
- the third and last operation (57) of this sequence is a multiplication of logical blocks.
- the logic block B4 (58) is thus the block obtained at the end of the sequence of generic operations.
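To make the sequence concrete, here is a minimal sketch that stores the three generic operations as rows (rank, operation, output, input 1, input 2) and evaluates them on a single line of values. The encoding of the rows, the name `run_sequence`, and the choice of B3 and B1 as the operands of the multiplication are assumptions made for illustration; only interior positions are computed, so the result is two values shorter than the input.

```python
from typing import Dict, List, Sequence

# Relative positions; "left"/"right" are encoded here as offsets of -1/+1.
OFFSETS = {"left": -1, "right": +1, None: 0}

# (rank, operation, output block, (input 1, position), (input 2, position))
SEQUENCE = [
    (1, "add", "B2", ("B1", "left"), ("B1", "right")),
    (2, "table", "B3", ("Param1", None), ("B2", None)),
    (3, "mul", "B4", ("B3", None), ("B1", None)),  # operands assumed
]


def run_sequence(b1: Sequence[int], param1: Sequence[int]) -> List[int]:
    """Evaluate the sequence on one line and return B4 on interior positions."""
    n = len(b1)
    blocks: Dict[str, List[int]] = {"B1": list(b1)}

    def at(name: str, pos, x: int) -> int:
        return blocks[name][x + OFFSETS[pos]]

    for _rank, op, out, in1, in2 in SEQUENCE:
        result = [0] * n
        for x in range(1, n - 1):
            if op == "add":
                result[x] = at(*in1, x) + at(*in2, x)
            elif op == "table":
                result[x] = param1[at(*in2, x)]
            elif op == "mul":
                result[x] = at(*in1, x) * at(*in2, x)
        blocks[out] = result
    return blocks["B4"][1:-1]
```

Nothing in `SEQUENCE` mentions memory layout, loop order or sub-objects, which is the sense in which such data is platform-independent.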
- the generic formatted data in the example in Table 4 are independent of the platform, the decomposition of the object into subobjects, the way in which the elementary information of the object is browsed, and the order in which the basic information will be processed in the platform, as well as the organization in memory. Indeed, the generic formatted data of table 1 can be translated in various ways into specific formatted data or into code for the platform, for example, without the list being limiting, according to the following translations.
- a first example of translation, although not optimal in terms of memory and computation time, illustrates a simple translation that does not involve any decomposition into sub-objects:
- a second example of translation shows that the size of the memory used can be decreased without changing the generic formatted data. Indeed, in the first example, 4 physical blocks of a size close to that of the image are used. Only 2 physical blocks are needed if the same memory is used for BP2, BP3 and BP4. We obtain the following translation:
- a third example of translation shows that the computation time can be reduced without changing the generic formatted data. Indeed, in the second example, 2 physical blocks of a size close to that of the image are used, but the physical block BP2 is written in full 3 times, the physical block BP1 is read in full 2 times and the physical block BP2 is read in full 2 times. This can be limited to a single read and a single write by using a different traversal mode and different blocks. This reduces the number of operations required, as well as the number of memory accesses. We obtain the following translation:
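The three translations discussed above can be compared with a small sketch. It is not the listing of the source: the function names are hypothetical, the image is a list of rows, the neighbours are taken horizontally, and the third generic operation is assumed to multiply B3 by B1. All three functions return the same interior result; they differ only in how many image-sized buffers they use and how many passes they make.

```python
from typing import List, Sequence

Image = List[List[int]]


def translation_1(bp1: Image, table: Sequence[int]) -> Image:
    """Four image-sized physical blocks, one full pass per generic operation."""
    h, w = len(bp1), len(bp1[0])
    bp2 = [[bp1[y][x - 1] + bp1[y][x + 1] for x in range(1, w - 1)] for y in range(h)]
    bp3 = [[table[v] for v in row] for row in bp2]
    bp4 = [[bp3[y][i] * bp1[y][i + 1] for i in range(w - 2)] for y in range(h)]
    return bp4


def translation_2(bp1: Image, table: Sequence[int]) -> Image:
    """Two physical blocks: BP2 is written three times, in place."""
    h, w = len(bp1), len(bp1[0])
    bp2 = [[bp1[y][x - 1] + bp1[y][x + 1] for x in range(1, w - 1)] for y in range(h)]
    for y in range(h):
        for i in range(w - 2):
            bp2[y][i] = table[bp2[y][i]]
    for y in range(h):
        for i in range(w - 2):
            bp2[y][i] *= bp1[y][i + 1]
    return bp2


def translation_3(bp1: Image, table: Sequence[int]) -> Image:
    """Single pass: every output value is computed and written exactly once."""
    return [
        [table[row[x - 1] + row[x + 1]] * row[x] for x in range(1, len(row) - 1)]
        for row in bp1
    ]
```

The generic description of the sequence is identical in the three cases; only the traversal and the use of memory change.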
- a fifth example of translation is particularly suitable for a signal processing processor having a small fast memory and a large slow memory, each sub-object being a rectangle, for example of 32x32, or of any other size that maximizes the use of the fast memory, the rectangles being adjoining.
- the sub-objects are traversed from left to right then from top to bottom: Initiate a transfer by a DMA mechanism
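The traversal described for this fifth example can be sketched as follows, assuming adjoining rectangles that are clipped at the image borders. The names `iter_sub_objects` and `fetch_tile` are hypothetical, and the plain copy in `fetch_tile` merely stands in for the DMA transfer into the fast memory.

```python
from typing import Iterator, List, Tuple

Rect = Tuple[int, int, int, int]  # (x0, y0, width, height)


def iter_sub_objects(width: int, height: int,
                     tile_w: int = 32, tile_h: int = 32) -> Iterator[Rect]:
    """Yield adjoining rectangles, from left to right then from top to bottom."""
    for y0 in range(0, height, tile_h):
        for x0 in range(0, width, tile_w):
            yield x0, y0, min(tile_w, width - x0), min(tile_h, height - y0)


def fetch_tile(image: List[List[int]], rect: Rect) -> List[List[int]]:
    """Copy one rectangle out of the large memory (stand-in for a DMA transfer)."""
    x0, y0, w, h = rect
    return [row[x0:x0 + w] for row in image[y0:y0 + h]]
```

On a real signal processor the transfer of the next rectangle would typically be started while the current one is being processed; that overlap is not modelled here.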
- a sixth example of translation is particularly suitable for a vector processor capable of applying the same calculation to the various pixels of the vector, each sub-object being a rectangle of, for example, 64 horizontal pixels, or any other value equal to the size of a vector that the platform knows how to process and store. This translation requires no memory because only one vector is processed at a time. We thus obtain the following translation:
- for each line: create a vector V0 containing, on the right, the 2 left pixels of the line; extract from V0 and V1 the vector V2 corresponding to the two right pixels of V0 and the left pixels of V1, excluding the 2 right pixels of V1;
- add V1 and V2 to obtain V2; apply the table to each pixel of V2 to obtain V2; extract from V0 and V1 the vector V3 corresponding to the right pixel of V0 and the left pixels of V1, excluding the right pixel of V1; copy V1 into V0 for the next iteration; multiply V2 by V3 to obtain V2; store the result V2 in the current output physical block.
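The vector steps above can be followed on a small example. The sketch below assumes that V1 is loaded with the next `vec` pixels of the line at each iteration (this loading step does not appear in the text), that the line length minus 2 is a multiple of `vec`, and that the excluded pixels are the rightmost pixels of V1. The function name is hypothetical.

```python
from typing import List, Sequence


def translate_line(line: Sequence[int], table: Sequence[int], vec: int = 64) -> List[int]:
    """Process one line vector by vector, keeping only V0..V3 alive."""
    assert vec >= 2 and (len(line) - 2) % vec == 0
    # V0 holds, on its right, the 2 left pixels of the line.
    v0 = [0] * (vec - 2) + list(line[:2])
    out: List[int] = []
    for start in range(2, len(line), vec):
        v1 = list(line[start:start + vec])           # assumed load of the next vector
        v2 = v0[-2:] + v1[:-2]                       # pixels two positions to the left
        v2 = [a + b for a, b in zip(v1, v2)]         # left neighbour + right neighbour
        v2 = [table[v] for v in v2]                  # correspondence table
        v3 = v0[-1:] + v1[:-1]                       # pixels one position to the left (centres)
        v0 = v1                                      # keep V1 for the next iteration
        v2 = [a * b for a, b in zip(v2, v3)]         # multiply by the centre pixel
        out.extend(v2)                               # store into the output block
    return out
```

With `vec=4` and the line 1, 2, 3, 4, 5, 6, the centres are the pixels 2 to 5 and each output is table(left + right) multiplied by the centre.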
- the third, fourth, fifth, and sixth examples are examples of translating the sequence of generic operations into a sequence of specific operations. For simplicity, the examples produce a smaller image than the input image. It is easy, if necessary, to obtain an output image of the same size as the input image by adding code at the beginning and end of each line to duplicate the edge pixel.
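Duplicating the edge pixel, as mentioned above, amounts to padding each line before the sequence is applied. A minimal sketch, with a hypothetical helper name and a margin of one pixel per side (the width lost by the example sequence):

```python
from typing import List, Sequence


def pad_edges(line: Sequence[int], margin: int = 1) -> List[int]:
    """Duplicate the first and last pixel `margin` times on each side."""
    return [line[0]] * margin + list(line) + [line[-1]] * margin
```

Applying a sequence that shrinks each line by two pixels to `pad_edges(line)` then yields an output of the same length as `line`.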
- FIG. 5 represents the structure of the specific formatted data output by a processing means using a method according to the invention.
- the specific formatted data is computed by a processing means from generic formatted data provided to the processing means and from a browse mode determined by the processing means.
- the generic formatted data includes first data 36 containing data 38 describing at least one generic operation or sequence of operations to be performed by the processing means.
- the generic formatted data also includes second data 40 relating to the relative position and scale of logical blocks relative to each other, for generic operations involving at least two logical blocks. From these generic formatted data and the browse mode 34, the processing means provides data 42 relating to the specific operations, and data 44 relating to the loops. These data 42 and 44 are part of the specific formatted data 30.
- the processing platform comprises five processors chained in one dimension. This means that the result of the calculations output by processor ProcA is used as the input of processor ProcB, and so on. The result of the calculations output by processor ProcE is applied to the input of processor ProcA.
- Each of the processors has a memory of limited capacity, denoted MemA to MemE.
- This memory unit is intended to store the values of parameters useful to the specific operations performed by the processor, or elementary information or results of operations that are intended to be reused quickly by the processor.
- the processing consists in applying to the elementary information making up the object a sequence of eight operations denoted OP1 to OP8.
- each operation is assigned to a processor.
- the processor A performs OP1 and OP6,
- the processor B performs OP2 and OP7,
- the processor C performs OP3 and OP8,
- the processor D performs OP4 and OP9, and
- the processor E performs OP5 and OP10.
- Each processor executes a set of instructions (InsA to InsE) corresponding to the specific operations that have been assigned to it. The parameter values stored in the limited-capacity memories also depend on this assignment. For example, if OP1 is a multiplication by 2, the memory MemA will contain the number 2.
- Each line represents one of the 10 specific operations OP1 to OP10.
- Each column represents one of the elementary information IE1 to IE5 composing each of the sub-objects to be processed.
- This notation IE1 to IE5 is formal; it does not necessarily correspond to a spatial or temporal reality. Indeed, certain specific operations have the effect of moving the elementary information.
- the information IE1 processed by the specific operation OP2 may not be the result of the specific operation OP1 applied to the information IE1, but the result of this specific operation OP1 applied to the information IE2, for example if the specific operation OP1 consists of a shift to the left.
- Each box in this table contains the name of the processor that performs the specific operation, as well as the time when that specific operation is performed during processing. Of course, this table represents only part of the treatment. It is assumed here that all the results of specific operations required have been calculated beforehand in the processing.
- ProcA performs the operation OP1 on the first information IE1 of the sub-object 1. At this moment, the other processors are carrying out other operations not represented on this table.
- each of the processors performs an operation on one of the information of the sub-object 1.
- once a processor has performed a specific operation on all the elementary information of a sub-object, it switches to the next operation among those assigned to it.
- the processor ProcA performs, from T6, the operation OP6.
- This sequencing is obtained by the one-dimensional circular chaining of the processors.
- the elementary information can therefore transit from one calculation unit to another.
- the elementary information IE1 passes through all the processors to "undergo" the specific operations OP1 to OP5, then goes back to the processor ProcA to restart a cycle and "undergo" the operations OP6 to OP7 (the initial elementary information IE1 will not necessarily be the information IE1 at all stages).
- It can thus be seen that the invention makes it possible to generate specific formatted data adapted to a systolic architecture, which has the advantage, in particular, of storing the parameter values locally and of allowing the data paths to be wired.
- the exact sequencing can be performed at least partially by a compiler of the platform. In this case, the specific formatted data does not contain the absolute time sequencing, but rather constraints on the sequencing.
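The sequencing on the circular chain of five processors can be reproduced with a short sketch. It assumes ten operations OP1 to OP10, five elementary information items per sub-object, one time step per operation, and a result that reaches the next processor on the following step; the function name and the 1-based numbering of times are illustrative choices.

```python
from typing import Dict, Tuple


def systolic_schedule(n_procs: int = 5, n_ops: int = 10,
                      n_info: int = 5) -> Dict[Tuple[int, int], Tuple[str, int]]:
    """Map (operation, elementary information) to (processor, time step)."""
    schedule = {}
    for op in range(1, n_ops + 1):
        # Operations are assigned round-robin: OP1 and OP6 to ProcA, and so on.
        proc = "Proc" + chr(ord("A") + (op - 1) % n_procs)
        for info in range(1, n_info + 1):
            # IE1 is processed first; each later item follows one step behind.
            schedule[(op, info)] = (proc, op + info - 1)
    return schedule
```

With these defaults, ProcA performs OP1 at T1 to T5 and OP6 at T6 to T10, and no processor is ever asked to do two things in the same time step. The conflict-free property relies on the number of items per sub-object being equal to the number of processors.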
- the platform contains five processors connected to a common memory.
- Such a structure is traditional: it corresponds to that of a vector processor (of the "Single Instruction Multiple Data", or SIMD, type).
- each processor is individually connected to a small memory that can contain parameters such as a correspondence table T.
- each processor performs all the specific operations.
- all processors receive the same set of INS instructions.
- one of the operations consists of using a table to modify one or more elementary information items.
- each of the processors has access to its own table, all the tables being identical.
- each memory is shared by a group of processors.
- all the processors share the same memory and simultaneously obtain the same parameter; in this case, the application of a correspondence table must be performed by calculation, using one or more parameters, for example to evaluate a polynomial.
- the platform comprises a vector processor composed of five processors connected to a common memory, similar to the vector processor notably present in a personal computer (PC). They are also all connected to a small memory that can contain parameters, and in particular a correspondence table. In this structure, each processor performs all the specific operations. Thus, all processors receive the same set of INS instructions with data describing all the specific operations to be performed.
- PC personal computer
- the operation OP4 is performed by the processors ProcA to ProcE respectively at times T4 to T8. If it is assumed that the operation OP5 also uses a table, then similarly the operation OP5 is performed by the processors ProcA to ProcE respectively at times T9 to T13.
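The times quoted above are consistent with a simple model in which operations that do not use the table run on all processors in the same time step, while operations that use the single shared table are served one processor at a time. This model, the function name, and the assumption that OP1 to OP3 each take one step are illustrative and are not stated in the source.

```python
from typing import Dict, List, Sequence, Tuple


def simd_schedule(ops: Sequence[Tuple[str, bool]],
                  n_procs: int = 5) -> Dict[str, List[int]]:
    """For each operation, return the time step at which each processor runs it.

    `ops` is a sequence of (name, uses_table) pairs, in execution order.
    """
    t = 1
    times: Dict[str, List[int]] = {}
    for name, uses_table in ops:
        if uses_table:
            # Access to the shared table is serialized across processors.
            times[name] = [t + p for p in range(n_procs)]
            t += n_procs
        else:
            # All processors execute the same instruction simultaneously.
            times[name] = [t] * n_procs
            t += 1
    return times
```

Under this model, giving each processor its own copy of the table removes the serialization, at the cost of more local memory.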
- sequencing can be performed at least partially by a compiler of the platform.
- specific formatted data do not contain the absolute sequencing over time, but rather constraints on the sequencing.
- FIG. 9a shows an exemplary embodiment of a platform comprising several circular chainings along one dimension of the sub-object.
- the object is a two-dimensional image
- the sub-object has 4 basic information
- the platform has 4 processors arranged in a grid of 4 * 1 processors corresponding to a rectangle of 4 processors horizontally and 1 processor vertically.
- the processors are named, from left to right: P1, P2, P3, and P4.
- the method also implements in this example 2 queues: a horizontal queue FHa is connected as an input to an output of P4 and as an output to an input of the processor P1.
- An output of P1 is connected to an input of P2.
- An output of P2 is connected to an input of P3, and an output of P3 is connected to an input of P4.
- a vertical queue FVa is connected at the input to an output of P1, P2, P3 and P4 and at the output to an input of the processor P1, P2, P3 and P4.
- the sequence of specific operations can implement an arbitrary number of FH horizontal filters while using the 4 processors at 100%.
- a specific operation OS2 performs the calculation of a filter consisting of an addition between the result of a specific operation OS1 and the result of the same specific operation OS1 on the left: the result of OS1 on processor P4 is put in the queue FHa and will be used by OS2 on P1 when a following sub-object is calculated; the result of OS1 on processor P3 is transferred to processor P4, to be used by OS2 on P4 in combination with the result of OS1 on P4; the result of OS1 on processor P2 is transferred to processor P3, to be used by OS2 on P3 in combination with the result of OS1 on P3; the result of OS1 on processor P1 is transferred to processor P2, to be used by OS2 on P2 in combination with the result of OS1 on P2; the result of the OS1 operation performed by P4 during a calculation of a previous
- the sequence of specific operations can implement an arbitrary number of vertical filters FV while using the 4 processors at 100%.
- the sequence of specific operations can implement an arbitrary number of filters FVH that are non-separable along the 2 dimensions, horizontal and vertical, while using the 4 processors at 100%; for example, a 3x3 non-separable filter applied to 4 results of a specific operation OS4 can call on FVa twice and then on FHa six times, to obtain the 8 sets of 4 previously calculated results of OS4 to be combined with the set of results of OS4 for the current sub-object; for example, these non-separable filters can be used in combination with vertical and/or horizontal filters, the 2 queues making it possible to recover the data in the right order.
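The role of the horizontal queue FHa in the filter OS2 can be sketched on one line of values, with sub-objects of 4 values processed by P1 to P4. In this sketch OS1 is taken to be the identity, the queue initially holds a single value `initial` (an assumption about what P1 sees for the very first sub-object), and the function name is hypothetical.

```python
from collections import deque
from typing import List, Sequence


def horizontal_filter(row: Sequence[int], n_procs: int = 4,
                      initial: int = 0) -> List[int]:
    """OS2 = OS1 + (OS1 of the left neighbour), one sub-object at a time."""
    assert len(row) % n_procs == 0
    fha = deque([initial])  # holds P4's OS1 result from the previous sub-object
    out: List[int] = []
    for start in range(0, len(row), n_procs):
        os1 = list(row[start:start + n_procs])  # OS1 results on P1..P4
        from_queue = fha.popleft()              # what P1 receives from FHa
        fha.append(os1[-1])                     # P4 feeds FHa for the next sub-object
        left = [from_queue] + os1[:-1]          # P2..P4 receive from P1..P3
        out.extend(a + b for a, b in zip(os1, left))
    return out
```

Every processor performs one addition per sub-object, so all four stay busy, and the queue is what allows the left neighbour of P1 to come from a different sub-object.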
- FIG. 9b shows a second example, in which the object is a two-dimensional image, the sub-object comprises 4 elementary information items, and the platform comprises 4 processors arranged in a grid of 2 * 2 processors, corresponding to a rectangle of 2 processors horizontally and 2 processors vertically. The processors are named, from left to right, P4 and P5 on the top line and P6 and P7 on the bottom line.
- the method also implements in this example 2 queues:
- a horizontal queue FHb is connected as an input to the output of P3 and P6 and as an output to the input of P1 and P4;
- a vertical queue FVb is connected as an input to an output of P4 and P5 and as an output to an input of the processors P6 and P7.
- the platform comprises a single processor P8, connected to a horizontal queue FHc and to a vertical queue FVc. These two queues can be used by the processor to store results of specific operations for later use.
- the sequence of specific operations can implement an arbitrary number of vertical and/or horizontal and/or non-separable filters while using the processor at 100%.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP06847179A EP1963971A2 (fr) | 2005-12-19 | 2006-12-19 | Procede et systeme de traitement de donnes numeriques |
JP2008545067A JP6085405B2 (ja) | 2005-12-19 | 2006-12-19 | ディジタルデータ処理のための方法およびシステム |
KR1020087017730A KR101391465B1 (ko) | 2005-12-19 | 2006-12-19 | 디지털 데이터 처리 방법 및 시스템 |
US12/097,893 US8429625B2 (en) | 2005-12-19 | 2006-12-19 | Digital data processing method and system |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR0553946 | 2005-12-19 | ||
FR0553946A FR2895103B1 (fr) | 2005-12-19 | 2005-12-19 | Procede et systeme de traitement de donnees numeriques |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2007071883A2 true WO2007071883A2 (fr) | 2007-06-28 |
WO2007071883A3 WO2007071883A3 (fr) | 2007-08-16 |
Family
ID=37430519
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/FR2006/051389 WO2007071883A2 (fr) | 2005-12-19 | 2006-12-19 | Procede et systeme de traitement de donnes numeriques |
Country Status (7)
Country | Link |
---|---|
US (1) | US8429625B2 (fr) |
EP (1) | EP1963971A2 (fr) |
JP (1) | JP6085405B2 (fr) |
KR (1) | KR101391465B1 (fr) |
CN (1) | CN101379468A (fr) |
FR (1) | FR2895103B1 (fr) |
WO (1) | WO2007071883A2 (fr) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1999052040A1 (fr) * | 1998-04-08 | 1999-10-14 | Stellar Technologies, Ltd. | Architecture pour traitement graphique |
Family Cites Families (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ATE121208T1 (de) * | 1990-01-30 | 1995-04-15 | Johnson Service Co | Vernetztes betriebsmittelverwaltungssystem. |
US5655130A (en) * | 1994-10-14 | 1997-08-05 | Unisys Corporation | Method and apparatus for document production using a common document database |
US5860072A (en) * | 1996-07-11 | 1999-01-12 | Tandem Computers Incorporated | Method and apparatus for transporting interface definition language-defined data structures between heterogeneous systems |
US5857100A (en) * | 1996-09-03 | 1999-01-05 | Insession Inc. | System, method and article of manufacture for extending externalization for universal transaction processing |
US6014702A (en) * | 1997-06-04 | 2000-01-11 | International Business Machines Corporation | Host information access via distributed programmed objects |
US7039919B1 (en) * | 1998-10-02 | 2006-05-02 | Microsoft Corporation | Tools and techniques for instrumenting interfaces of units of a software program |
US6988271B2 (en) * | 1998-10-02 | 2006-01-17 | Microsoft Corporation | Heavyweight and lightweight instrumentation |
US6463582B1 (en) * | 1998-10-21 | 2002-10-08 | Fujitsu Limited | Dynamic optimizing object code translator for architecture emulation and dynamic optimizing object code translation method |
GB9825102D0 (en) * | 1998-11-16 | 1999-01-13 | Insignia Solutions Plc | Computer system |
GB0107882D0 (en) * | 2001-03-29 | 2001-05-23 | Ibm | Parsing messages with multiple data formats |
DE60231752D1 (de) | 2001-07-12 | 2009-05-07 | Do Labs | Verfahren und system zur umsetzung eines bildes aus einem digitalen bild |
CN1316427C (zh) | 2001-07-12 | 2007-05-16 | 杜莱布斯公司 | 产生与装置链的装置的缺陷相关的格式化信息的方法和系统 |
FR2827459B1 (fr) | 2001-07-12 | 2004-10-29 | Poseidon | Procede et systeme pour fournir a des logiciels de traitement d'image des informations formatees liees aux caracteristiques des appareils de capture d'image et/ou des moyens de restitution d'image |
US7191404B2 (en) * | 2002-01-14 | 2007-03-13 | International Business Machines Corporation | System and method for mapping management objects to console neutral user interface |
US7565660B2 (en) * | 2002-09-26 | 2009-07-21 | Siemens Energy & Automation, Inc. | System and method for universal extensibility that supports a plurality of programmable logic controllers |
JP4487479B2 (ja) * | 2002-11-12 | 2010-06-23 | 日本電気株式会社 | Simd命令シーケンス生成方法および装置ならびにsimd命令シーケンス生成用プログラム |
WO2004072796A2 (fr) * | 2003-02-05 | 2004-08-26 | Arizona Board Of Regents | Traitement reconfigurable |
US20040187090A1 (en) * | 2003-03-21 | 2004-09-23 | Meacham Randal P. | Method and system for creating interactive software |
US7987455B1 (en) * | 2003-07-23 | 2011-07-26 | International Business Machines Corporation | System and method of command processing |
US7647580B2 (en) * | 2004-09-07 | 2010-01-12 | Microsoft Corporation | General programming language support for nullable types |
US7694288B2 (en) * | 2005-10-24 | 2010-04-06 | Analog Devices, Inc. | Static single assignment form pattern matcher |
US8117587B1 (en) * | 2008-06-03 | 2012-02-14 | Richard Paul Testardi | Microcontroller-resident software development environment supporting application-level asynchronous event handling, interactive debugging and pin variables for embedded systems |
-
2005
- 2005-12-19 FR FR0553946A patent/FR2895103B1/fr not_active Expired - Fee Related
-
2006
- 2006-12-19 CN CNA2006800530321A patent/CN101379468A/zh active Pending
- 2006-12-19 EP EP06847179A patent/EP1963971A2/fr not_active Ceased
- 2006-12-19 WO PCT/FR2006/051389 patent/WO2007071883A2/fr active Application Filing
- 2006-12-19 JP JP2008545067A patent/JP6085405B2/ja not_active Expired - Fee Related
- 2006-12-19 US US12/097,893 patent/US8429625B2/en active Active
- 2006-12-19 KR KR1020087017730A patent/KR101391465B1/ko active IP Right Grant
Non-Patent Citations (8)
Title |
---|
BACON D F ET AL: "Compiler transformations for high-performance computing" ACM COMPUTING SURVEYS, NEW YORK, NY, US, vol. 26, no. 4, décembre 1994 (1994-12), pages 345-420, XP002246513 ISSN: 0360-0300 * |
BENKNER S ET AL: "Processing array statements and procedure interfaces in the PREPARE High Performance Fortran compiler" COMPILER CONSTRUCTION. 5TH INTERNATIONAL CONFERENCE, CC'94. PROCEEDINGS SPRINGER-VERLAG BERLIN, GERMANY, 1994, pages 324-338, XP001248203 ISBN: 3-540-57877-3 * |
C. AIGLON, CH. LAVARENNE, Y. SOREL AND A. VICARD: "Utilisation de SynDEx pour le traitement d'images temps-réel" RAPPORT DE RECHERCHE INRIA, no. 2968, septembre 1996 (1996-09), pages 1-79, XP002409017 INRIA Rocquencourt * |
DARTE A ET AL: "Generalized multipartitioning of multi-dimensional arrays for parallelizing line-sweep computations" JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, ELSEVIER, AMSTERDAM, NL, vol. 63, no. 9, septembre 2003 (2003-09), pages 887-911, XP004459910 ISSN: 0743-7315 * |
GRANDPIERRE T ET AL: "From algorithm and architecture specifications to automatic generation of distributed real-time executives:a seamless flow of graphs transformations" FORMAL METHODS AND MODELS FOR CO-DESIGN, 2003. MEMOCODE '03. PROCEEDINGS. FIRST ACM AND IEEE INTERNATIONAL CONFERENCE ON 24-26 JUNE 2003, PISCATAWAY, NJ, USA,IEEE, 24 juin 2003 (2003-06-24), pages 123-132, XP010643808 ISBN: 0-7695-1923-7 * |
KAHN G ED - ROSENFELD J L INTERNATIONAL FEDERATION FOR INFORMATION PROCESSING: "THE SEMANTICS OF A SIMPLE LANGUAGE FOR PARALLEL PROGRAMMING" INFORMATION PROCESSING. STOCKHOLM, AUGUST 5-10, 1974, PROCEEDINGS OF IFIP CONGRESS, AMSTERDAM, NORTH-HOLLAND, NL, vol. PROC. 1974, 1974, pages 471-475, XP001179529 * |
RAULET M ET AL: "Automatic coarse-grain partitioning and automatic code generation for heterogeneous architectures" SIGNAL PROCESSING SYSTEMS, 2003. SIPS 2003. IEEE WORKSHOP ON 27 - 29 AUG. 2003, PISCATAWAY, NJ, USA, IEEE, 27 August 2003 (2003-08-27), pages 316-321, XP010661035 ISBN: 0-7803-7795-8 * |
Z. CHAMSKI, A. COHEN, M. DURANTON, C. EISENBEIS, P. FEAUTRIER, D. GENIUS, L. PASQUIER, V. RIVIERRE-VIER, F. THOMASSET AND Q. ZHAO: "The SANDRA project: cooperative architecture/compiler technology for embedded real-time streaming applications" RAPPORT DE RECHERCHE INRIA, no. 4773, March 2003 (2003-03), pages 1-13, XP002380285 INRIA Rocquencourt * |
Cited By (57)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10277833B2 (en) | 2015-01-22 | 2019-04-30 | Google Llc | Virtual linebuffers for image signal processors |
US10791284B2 (en) | 2015-01-22 | 2020-09-29 | Google Llc | Virtual linebuffers for image signal processors |
US9749548B2 (en) | 2015-01-22 | 2017-08-29 | Google Inc. | Virtual linebuffers for image signal processors |
US10516833B2 (en) | 2015-01-22 | 2019-12-24 | Google Llc | Virtual linebuffers for image signal processors |
US10754654B2 (en) | 2015-04-23 | 2020-08-25 | Google Llc | Energy efficient processor core architecture for image processor |
US11138013B2 (en) | 2015-04-23 | 2021-10-05 | Google Llc | Energy efficient processor core architecture for image processor |
US9785423B2 (en) | 2015-04-23 | 2017-10-10 | Google Inc. | Compiler for translating between a virtual image processor instruction set architecture (ISA) and target hardware having a two-dimensional shift array structure |
US11190718B2 (en) | 2015-04-23 | 2021-11-30 | Google Llc | Line buffer unit for image processor |
CN107430760A (zh) * | 2015-04-23 | 2017-12-01 | Google Inc. | Two dimensional shift array for image processor |
GB2554204A (en) * | 2015-04-23 | 2018-03-28 | Google Inc | Compiler for translating between a virtual image processor instruction set architecture (ISA) and target hardware |
US9965824B2 (en) | 2015-04-23 | 2018-05-08 | Google Llc | Architecture for high performance, power efficient, programmable image processing |
US10417732B2 (en) | 2015-04-23 | 2019-09-17 | Google Llc | Architecture for high performance, power efficient, programmable image processing |
US11153464B2 (en) | 2015-04-23 | 2021-10-19 | Google Llc | Two dimensional shift array for image processor |
US10095492B2 (en) | 2015-04-23 | 2018-10-09 | Google Llc | Compiler for translating between a virtual image processor instruction set architecture (ISA) and target hardware having a two-dimensional shift array structure |
US10095479B2 (en) | 2015-04-23 | 2018-10-09 | Google Llc | Virtual image processor instruction set architecture (ISA) and memory model and exemplary target hardware having a two-dimensional shift array structure |
US11140293B2 (en) | 2015-04-23 | 2021-10-05 | Google Llc | Sheet generator for image processor |
GB2554204B (en) * | 2015-04-23 | 2021-08-25 | Google Llc | Compiler for translating between virtual image processor instruction set architecture (ISA) and target hardware having a two-dimensional shift array structure |
US10216487B2 (en) | 2015-04-23 | 2019-02-26 | Google Llc | Virtual image processor instruction set architecture (ISA) and memory model and exemplary target hardware having a two-dimensional shift array structure |
US9769356B2 (en) | 2015-04-23 | 2017-09-19 | Google Inc. | Two dimensional shift array for image processor |
US10275253B2 (en) | 2015-04-23 | 2019-04-30 | Google Llc | Energy efficient processor core architecture for image processor |
US10284744B2 (en) | 2015-04-23 | 2019-05-07 | Google Llc | Sheet generator for image processor |
US10291813B2 (en) | 2015-04-23 | 2019-05-14 | Google Llc | Sheet generator for image processor |
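CN110149802B (zh) * | 2015-04-23 | 2021-04-16 | Google LLC | Compiler for translating between a virtual image processor instruction set architecture (ISA) and target hardware having a two-dimensional shift array structure |
CN107430760B (zh) * | 2015-04-23 | 2021-01-12 | Google LLC | Two dimensional shift array for image processor |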
US10321077B2 (en) | 2015-04-23 | 2019-06-11 | Google Llc | Line buffer unit for image processor |
WO2016171846A1 (fr) * | 2015-04-23 | 2016-10-27 | Google Inc. | Compiler for translating between a virtual image processor instruction set architecture (ISA) and target hardware having a two-dimensional shift array structure |
US10719905B2 (en) | 2015-04-23 | 2020-07-21 | Google Llc | Architecture for high performance, power efficient, programmable image processing |
US10638073B2 (en) | 2015-04-23 | 2020-04-28 | Google Llc | Line buffer unit for image processor |
US10397450B2 (en) | 2015-04-23 | 2019-08-27 | Google Llc | Two dimensional shift array for image processor |
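CN110149802A (zh) * | 2015-04-23 | 2019-08-20 | Google LLC | Compiler for translating between a virtual image processor instruction set architecture (ISA) and target hardware having a two-dimensional shift array structure |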
US10599407B2 (en) | 2015-04-23 | 2020-03-24 | Google Llc | Compiler for translating between a virtual image processor instruction set architecture (ISA) and target hardware having a two-dimensional shift array structure |
US11182138B2 (en) | 2015-04-23 | 2021-11-23 | Google Llc | Compiler for translating between a virtual image processor instruction set architecture (ISA) and target hardware having a two-dimensional shift array structure |
US9772852B2 (en) | 2015-04-23 | 2017-09-26 | Google Inc. | Energy efficient processor core architecture for image processor |
US10560598B2 (en) | 2015-04-23 | 2020-02-11 | Google Llc | Sheet generator for image processor |
US9756268B2 (en) | 2015-04-23 | 2017-09-05 | Google Inc. | Line buffer unit for image processor |
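CN105516033A (zh) * | 2015-07-23 | 2016-04-20 | The 41st Research Institute of China Electronics Technology Group Corporation | Analog signal demodulation and analysis method based on a spectrum analyzer |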
US10313641B2 (en) | 2015-12-04 | 2019-06-04 | Google Llc | Shift register with reduced wiring complexity |
US9830150B2 (en) | 2015-12-04 | 2017-11-28 | Google Llc | Multi-functional execution lane for image processor |
US10185560B2 (en) | 2015-12-04 | 2019-01-22 | Google Llc | Multi-functional execution lane for image processor |
US10477164B2 (en) | 2015-12-04 | 2019-11-12 | Google Llc | Shift register with reduced wiring complexity |
US10998070B2 (en) | 2015-12-04 | 2021-05-04 | Google Llc | Shift register with reduced wiring complexity |
US10204396B2 (en) | 2016-02-26 | 2019-02-12 | Google Llc | Compiler managed memory for image processor |
US10387989B2 (en) | 2016-02-26 | 2019-08-20 | Google Llc | Compiler techniques for mapping program code to a high performance, power efficient, programmable image processing hardware platform |
US10387988B2 (en) | 2016-02-26 | 2019-08-20 | Google Llc | Compiler techniques for mapping program code to a high performance, power efficient, programmable image processing hardware platform |
US10304156B2 (en) | 2016-02-26 | 2019-05-28 | Google Llc | Compiler managed memory for image processor |
US10685422B2 (en) | 2016-02-26 | 2020-06-16 | Google Llc | Compiler managed memory for image processor |
US10380969B2 (en) | 2016-02-28 | 2019-08-13 | Google Llc | Macro I/O unit for image processor |
US10733956B2 (en) | 2016-02-28 | 2020-08-04 | Google Llc | Macro I/O unit for image processor |
US10504480B2 (en) | 2016-02-28 | 2019-12-10 | Google Llc | Macro I/O unit for image processor |
US11196953B2 (en) | 2016-07-01 | 2021-12-07 | Google Llc | Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
US10915773B2 (en) | 2016-07-01 | 2021-02-09 | Google Llc | Statistics operations on two dimensional image processor |
US10546211B2 (en) | 2016-07-01 | 2020-01-28 | Google Llc | Convolutional neural network on programmable two dimensional image processor |
US10531030B2 (en) | 2016-07-01 | 2020-01-07 | Google Llc | Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
US9986187B2 (en) | 2016-07-01 | 2018-05-29 | Google Llc | Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
US9978116B2 (en) | 2016-07-01 | 2018-05-22 | Google Llc | Core processes for block operations on an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
US10334194B2 (en) | 2016-07-01 | 2019-06-25 | Google Llc | Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register |
US10789505B2 (en) | 2016-07-01 | 2020-09-29 | Google Llc | Convolutional neural network on programmable two dimensional image processor |
Also Published As
Publication number | Publication date |
---|---|
FR2895103A1 (fr) | 2007-06-22 |
JP6085405B2 (ja) | 2017-02-22 |
KR20080087840A (ko) | 2008-10-01 |
EP1963971A2 (fr) | 2008-09-03 |
CN101379468A (zh) | 2009-03-04 |
FR2895103B1 (fr) | 2008-02-22 |
US8429625B2 (en) | 2013-04-23 |
JP2009524854A (ja) | 2009-07-02 |
US20090228677A1 (en) | 2009-09-10 |
WO2007071883A3 (fr) | 2007-08-16 |
KR101391465B1 (ko) | 2014-05-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2007071883A2 (fr) | Method and system for processing digital data | |
EP1964053A2 (fr) | Method for processing an object on a platform having one or more processors and memories, and platform using the method | |
Tancik et al. | Nerfstudio: A modular framework for neural radiance field development | |
Yue et al. | Supervised raw video denoising with a benchmark dataset on dynamic scenes | |
CN111194458B (zh) | Image signal processor for processing images | |
Henz et al. | Deep joint design of color filter arrays and demosaicing | |
EP0369854B1 (fr) | Method and circuit for block processing of a two-dimensional signal of moving images | |
Nguyen et al. | Raw image reconstruction using a self-contained srgb–jpeg image with small memory overhead | |
FR3091375A1 (fr) | Load-store instruction | |
WO2010037570A1 (fr) | Device for the parallel processing of a data stream | |
WO2007071882A2 (fr) | Method for providing data to a digital processing means | |
Cho et al. | Single‐shot High Dynamic Range Imaging Using Coded Electronic Shutter | |
Liu et al. | Tape: Task-agnostic prior embedding for image restoration | |
Mazumdar et al. | A hardware-friendly bilateral solver for real-time virtual reality video | |
Vera et al. | Shuffled rolling shutter for snapshot temporal imaging | |
Ma et al. | Searching for fast demosaicking algorithms | |
Verma et al. | Deep demosaicing using resnet-bottleneck architecture | |
Hu et al. | Single image reflection separation via component synergy | |
Hajisharif et al. | Single sensor compressive light field video camera | |
CN113298740A (zh) | Image enhancement method, apparatus, terminal device, and storage medium | |
FR2823050A1 (fr) | Device jointly implementing post-processing and decoding of data | |
Hajisharif | Computational Photography: High Dynamic Range and Light Fields | |
FR2560700A1 (fr) | Processing apparatus for real-time hierarchical pyramid signal synthesis | |
Buggy et al. | Neural net architectures for image demosaicing | |
CN115103118B (zh) | High dynamic range image generation method, apparatus, device, and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101) | |
REEP | Request for entry into the European phase | Ref document number: 2006847179; Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 2006847179; Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 2008545067; Country of ref document: JP |
NENP | Non-entry into the national phase | Ref country code: DE |
WWE | WIPO information: entry into national phase | Ref document number: 2596/KOLNP/2008; Country of ref document: IN |
WWE | WIPO information: entry into national phase | Ref document number: 1020087017730; Country of ref document: KR |
WWE | WIPO information: entry into national phase | Ref document number: 200680053032.1; Country of ref document: CN |
WWP | WIPO information: published in national office | Ref document number: 2006847179; Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 12097893; Country of ref document: US |