CN102411773B - Vector-processor-oriented mean-residual normalized product correlation vectoring method - Google Patents
Abstract
The invention discloses a vector-processor-oriented vectorization method for the mean-residual normalized product correlation coefficient. The method comprises the following steps: setting a reference image A and a real-time image B; traversing the real-time image B and calculating the mean of its pixel values and the accumulated sum of the squared pixel values $B_{ij}^2$; traversing the reference image A, each time taking two subimages $A_{uv}$ and $A_{(u+4)v}$ from it and shuffling them to obtain four subimages $A_{(u+k)v}$ (k = 0, 1, 2, 3); calculating in turn the accumulated sum of the pixel values, the accumulated sum of $(A_{(u+k)v})_{ij}^2$ and the accumulated sum of $(A_{(u+k)v})_{ij} \cdot B_{ij}$; calculating in turn the mean-residual normalized product correlation coefficients of the subimages $A_{(u+k)v}$ (k = 0, 1, 2, 3) with the real-time image B; and setting u to u+4 and repeating these steps until the reference image A has been traversed completely, so as to obtain all the mean-residual normalized product correlation coefficient values.
Description
Technical field
The present invention relates to the field of image matching and its vectorized implementation, and in particular to a vectorization method for the mean-residual normalized product correlation coefficient.
Background technology
As the computational demands of compute-intensive applications such as 4G wireless communication, radar signal processing, high-definition video and digital image processing keep rising, a single chip can hardly satisfy application requirements, and multi-core processors, vector processors in particular, have come into wide use. A vector processor generally consists of multiple processing elements (PEs) and usually supports vector-based data loads and stores. Each PE contains several independent functional units, typically including a shift unit, an ALU and a multiplier. Vector processors usually support SIMD (single instruction, multiple data) operation: under the control of a single vector instruction, all PEs simultaneously perform the same operation on their own local registers, which exploits the data-level parallelism of the application.
Image matching is one of the many computation-intensive tasks in image processing applications. Template-based image matching, for example, often needs to compute the similarity between a reference image and a real-time image, using measures such as the sum of absolute differences or the normalized product correlation coefficient (Normalized Product correlation, Nprod). Among these, the mean-residual normalized product correlation coefficient has strong noise immunity and is one of the most widely used similarity criteria in image matching. Such a computation-intensive task, however, must match the real-time image against every subimage of the reference image one by one, so the amount of computation is very large. On a single-chip processor, a fast sliding algorithm that computes by rows and columns is usually adopted to reduce the computation. On a vector processor, however, this fast algorithm cannot be implemented effectively. An efficient vectorization method is therefore the key to making full use of the abundant computing resources of a vector processor, exploiting its multi-level parallelism and improving its utilization.
The mean-residual normalized product correlation coefficient is computed as follows. Let the reference image be A, of size M×N, and the real-time image be B, of size m×n, with M > m and N > n. Let $A_{uv}$ denote the subimage of the reference image whose top-left corner is at (u, v). Its mean-residual normalized product correlation coefficient with the real-time image B is denoted ρ(u, v), where $(A_{uv})_{ij}$ is the pixel value at coordinate (i, j) in the subimage $A_{uv}$ and $B_{ij}$ is the pixel value at coordinate (i, j) in the real-time image B. The value ρ(u, v) expresses the degree of match between the subimage $A_{uv}$ and the real-time image B. To find the best matching position, all subimages of the reference image must be traversed, the mean-residual normalized product correlation coefficient of each subimage with the real-time image computed one by one, and the minimum of these values found. In total (M-m)×(N-n) coefficients must be computed, and each one involves summing a large number of elements, summing element products and accumulating, so the amount of computation is very large. On a single-chip processor, a fast sliding algorithm that computes by rows and columns is usually adopted. Its advantage is that it reuses earlier results and avoids a great deal of repeated computation. For a vector processor, however, there are two obstacles. First, a vector processor contains multiple processing elements, which makes it difficult to reuse earlier results. Second, subimage pixel data are usually 8-bit pixel values, so traversing the reference image requires reading image data at byte offsets, while vector processors generally do not support data reads that cross word boundaries. An effective vector-processor-oriented vectorization method for the mean-residual normalized product correlation coefficient is currently lacking.
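The formula image for ρ(u, v) is not reproduced in this text. For reference, the standard definition of a mean-removed normalized product correlation coefficient, written with the symbols defined above, is given below. The overbar notation for the means ($\bar{A}_{uv}$, $\bar{B}$) is an assumption made here, and the sums run over all m·n pixel positions (i, j) of the real-time image.

$$
\rho(u,v)=\frac{\sum_{i,j}\left((A_{uv})_{ij}-\bar{A}_{uv}\right)\left(B_{ij}-\bar{B}\right)}{\sqrt{\sum_{i,j}\left((A_{uv})_{ij}-\bar{A}_{uv}\right)^{2}\,\sum_{i,j}\left(B_{ij}-\bar{B}\right)^{2}}},
\qquad
\bar{A}_{uv}=\frac{1}{mn}\sum_{i,j}(A_{uv})_{ij},\quad
\bar{B}=\frac{1}{mn}\sum_{i,j}B_{ij}
$$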
Summary of the invention
The technical problem to be solved by the present invention is as follows: in view of the problems of the prior art, the invention provides a vector-processor-oriented vectorization method for the mean-residual normalized product correlation coefficient that is simple in principle, easy to operate, computationally efficient and able to improve the utilization of the processor's computing resources.
To solve the above technical problem, the present invention adopts the following technical solution:
A vector-processor-oriented vectorization method for the mean-residual normalized product correlation coefficient, comprising the following steps:
(1) Let the reference image be A, of size M×N, and the real-time image be B, of size m×n, with M > m and N > n; the vector processor comprises p processing elements;
(2) The vector processor first traverses the real-time image B and reads its data into vector registers; SIMD-based dot-product operations sum the values within each processing element, and a reduction operation sums the values across processing elements, yielding the pixel mean of the real-time image B and the accumulated sum of its squared pixel values $B_{ij}^2$;
(3) The vector processor traverses the reference image A, each time fetching from A two subimages $A_{uv}$ and $A_{(u+4)v}$ whose starting positions are 4 elements apart and whose length is 4p elements; a shuffle operation then yields four subimages $A_{(u+k)v}$ (k = 0, 1, 2, 3) whose starting positions are successively 1 element apart and whose length is 4p elements;
(4) Using SIMD-based dot-product operations to sum the values within each processing element and a reduction operation to sum the values across processing elements, compute in turn, over all elements of each subimage $A_{(u+k)v}$ (k = 0, 1, 2, 3), the pixel mean, the accumulated pixel sum, the accumulated sum of squared pixel values $(A_{(u+k)v})_{ij}^2$, and the accumulated sum of the pixel-value products $(A_{(u+k)v})_{ij} \cdot B_{ij}$ of the reference image A and the real-time image B;
(5) Compute in turn the mean-residual normalized product correlation coefficients ρ(u, v), ρ(u+1, v), ρ(u+2, v), ρ(u+3, v) of the subimages $A_{(u+k)v}$ (k = 0, 1, 2, 3) with the real-time image B;
(6) Set u = u + 4 and repeat steps (3) to (6) until the reference image A has been traversed, which yields all mean-residual normalized product correlation coefficient values of the reference image A with the real-time image B.
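As a baseline for checking a vectorized implementation of the steps above, the coefficient can also be evaluated directly from its definition for every subimage. The following is a minimal scalar sketch, not the patented method: the function names are hypothetical, images are assumed to be lists of rows, u is assumed to be the horizontal (column) offset and v the vertical (row) offset, and returning 0.0 for a flat image is a convention chosen here.

```python
import math

def mean_residual_nprod(sub, tmpl):
    """Direct evaluation of the coefficient for two equal-sized 2-D lists."""
    a = [x for row in sub for x in row]
    b = [x for row in tmpl for x in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    # Assumed convention: a flat subimage or template gives coefficient 0.0.
    return num / den if den else 0.0

def correlation_map(ref, tmpl):
    """Coefficient for every offset (u, v) at which tmpl fits inside ref.

    The result is indexed as result[v][u].
    """
    m, n = len(tmpl), len(tmpl[0])
    rows, cols = len(ref), len(ref[0])
    return [[mean_residual_nprod([r[u:u + n] for r in ref[v:v + m]], tmpl)
             for u in range(cols - n + 1)]
            for v in range(rows - m + 1)]
```

Every subimage is rescanned from scratch here, which is exactly the cost that the four-at-a-time scheme of steps (3) to (6) is meant to spread across the processing elements.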
As further improvements of the present invention:
In step (2), the pixel mean and the accumulated sum of squared pixel values $B_{ij}^2$ are computed with packed dot products. Here $b_w = (B_{iw}, B_{i(w+1)}, B_{i(w+2)}, B_{i(w+3)})$ is a 32-bit fixed-point vector formed from four 8-bit pixel values, and $e_w = (1, 1, 1, 1)$ is a 32-bit fixed-point vector formed from four unit pixel values. The pixel sum is obtained from the dot products of $b_w$ with $e_w$: the p processing elements of the vector processor compute them simultaneously under SIMD control, and the results of the p processing elements are then summed by a fixed-point reduction. The accumulated sum of squares is obtained in the same way from the dot products of $b_w$ with itself. L is the loop count, with L = mn/(4p).
In step (4), the pixel mean, the accumulated pixel sum, the accumulated sum of squared pixel values $(A_{(u+k)v})_{ij}^2$ and the accumulated sum of the pixel-value products $(A_{(u+k)v})_{ij} \cdot B_{ij}$ are computed in the same manner. Here $a_w = ((A_{uv})_{iw}, (A_{uv})_{i(w+1)}, (A_{uv})_{i(w+2)}, (A_{uv})_{i(w+3)})$ is a 32-bit fixed-point vector formed from four 8-bit pixel values, and $e_w = (1, 1, 1, 1)$ is a 32-bit fixed-point vector formed from four unit pixel values. For each of the three accumulated sums, the p processing elements of the vector processor simultaneously compute the corresponding dot product under SIMD control ($a_w$ with $e_w$ for the pixel sum, $a_w$ with itself for the sum of squares, and $a_w$ with $b_w$ for the sum of products), and the results of the p processing elements are then summed by a fixed-point reduction.
In step (5), the mean-residual normalized product correlation coefficient of each $A_{(u+k)v}$ (k = 0, 1, 2, 3) with the real-time image B is computed from the pixel means and accumulated sums obtained in steps (2) and (4).
Compared with the prior art, the advantages of the present invention are:
1. The vector-processor-oriented vectorization method for the mean-residual normalized product correlation coefficient of the present invention is simple to implement, low in cost, easy to operate and reliable. It brings the parallel computing capability of all PEs of the vector processor fully into play and fully exploits the SIMD-based data parallelism of the vector processor. Each traversal of the real-time image yields four mean-residual normalized product correlation coefficient values, which effectively improves the execution efficiency on a vector processor of image matching algorithms based on this coefficient.
2. The method of the present invention is simpler and more efficient than conventional vectorization methods, its hardware cost on the target vector processor is low, and it reduces power consumption for the same function. In addition, the method is simple to implement, low in cost, easy to operate and reliable.
Description of drawings
Fig. 1 is the overall flow diagram of the present invention;
Fig. 2 is a schematic diagram showing how, in a specific embodiment of the present invention, the subimages $A_{uv}$ and $A_{(u+4)v}$ yield four adjacent subimages through the shuffle operation;
Fig. 3 is a schematic diagram showing how, in a specific embodiment of the present invention, the vector processor sums the values within a PE with SIMD-based dot-product operations and sums the values across PEs with a reduction operation.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings and a specific embodiment.
As shown in Fig. 1, the vector-processor-oriented vectorization method for the mean-residual normalized product correlation coefficient of the present invention comprises the following steps:
1. Let the reference image be A, of size M×N, and the real-time image be B, of size m×n, with M > m and N > n; the vector processor comprises p processing elements;
2. The vector processor first traverses the real-time image B and reads its data into vector registers; SIMD-based dot-product operations sum the values within each processing element, and a reduction operation sums the values across processing elements, yielding the pixel mean of the real-time image B and the accumulated sum of its squared pixel values $B_{ij}^2$;
The pixel mean and the accumulated sum of squared pixel values $B_{ij}^2$ are computed with packed dot products. Here $b_w = (B_{iw}, B_{i(w+1)}, B_{i(w+2)}, B_{i(w+3)})$ is a 32-bit fixed-point vector formed from four 8-bit pixel values, and $e_w = (1, 1, 1, 1)$ is a 32-bit fixed-point vector formed from four unit pixel values. The pixel sum is obtained from the dot products of $b_w$ with $e_w$: the p processing elements of the vector processor compute them simultaneously under SIMD control, and the results of the p processing elements are then summed by a fixed-point reduction. The accumulated sum of squares is obtained in the same way from the dot products of $b_w$ with itself. L is the loop count, with L = mn/(4p).
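The formula images for step 2 are not reproduced in this text. One reconstruction that is consistent with the definitions of $b_w$, $e_w$, L and p is sketched below. The loop index $l$, the PE index $q$, the notation $b_{w(l,q)}$ for the four-pixel group handled by PE $q$ in iteration $l$, and the symbol $\bar{B}$ for the pixel mean are all assumptions introduced here.

$$
\sum_{i,j} B_{ij}=\sum_{l=1}^{L}\sum_{q=1}^{p} b_{w(l,q)}\cdot e_w,
\qquad
\bar{B}=\frac{1}{mn}\sum_{i,j} B_{ij}
$$

$$
\sum_{i,j} B_{ij}^{2}=\sum_{l=1}^{L}\sum_{q=1}^{p} b_{w(l,q)}\cdot b_{w(l,q)},
\qquad
L=\frac{mn}{4p}
$$

The inner sum over $q$ corresponds to the reduction across the p processing elements, and the outer sum over $l$ to the accumulation across loop iterations. Each iteration consumes 4p pixels, so L iterations cover all mn pixels.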
As shown in Fig. 3, $(a_0, a_1, a_2, a_3)$ is the vector formed by the four elements $a_0, a_1, a_2, a_3$, and $(a_i, a_{i+1}, a_{i+2}, a_{i+3})$ is the vector formed by the four elements $a_i, a_{i+1}, a_{i+2}, a_{i+3}$; likewise, $(b_0, b_1, b_2, b_3)$ is the vector formed by the four elements $b_0, b_1, b_2, b_3$, and $(b_i, b_{i+1}, b_{i+2}, b_{i+3})$ is the vector formed by the four elements $b_i, b_{i+1}, b_{i+2}, b_{i+3}$. The two groups of vectors are stored in different vector registers. For the SIMD-based dot-product operation within a PE, the result for $(a_0, a_1, a_2, a_3)$ and $(b_0, b_1, b_2, b_3)$ is $(a_0 b_0, a_1 b_1, a_2 b_2, a_3 b_3)$, and the result for $(a_i, a_{i+1}, a_{i+2}, a_{i+3})$ and $(b_i, b_{i+1}, b_{i+2}, b_{i+3})$ is $(a_i b_i, a_{i+1} b_{i+1}, a_{i+2} b_{i+2}, a_{i+3} b_{i+3})$. The reduction sum across PEs then gives the sum over both groups of vectors: $a_0 b_0 + a_1 b_1 + a_2 b_2 + a_3 b_3 + \dots + a_i b_i + a_{i+1} b_{i+1} + a_{i+2} b_{i+2} + a_{i+3} b_{i+3}$.
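The two-stage operation of Fig. 3 can be modelled in a few lines of scalar code. This is only a functional sketch under stated assumptions: the function names are hypothetical, each PE is assumed to hold four lanes, and real hardware would perform the lane products and the reduction with vector instructions rather than Python loops.

```python
def pe_products(a4, b4):
    """Lane-wise products inside one PE: (a0*b0, a1*b1, a2*b2, a3*b3)."""
    return [x * y for x, y in zip(a4, b4)]

def dot_then_reduce(a, b, p):
    """Sum the lane products within each PE, then reduce the p partial sums."""
    assert len(a) == len(b) == 4 * p
    partial = [sum(pe_products(a[4 * q:4 * q + 4], b[4 * q:4 * q + 4]))
               for q in range(p)]
    return sum(partial)

# Two PEs, each holding four lanes of a and b.
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [8, 7, 6, 5, 4, 3, 2, 1]
print(dot_then_reduce(a, b, p=2))  # 60 from each PE, 120 after the reduction
```

Passing a vector of ones as `b` gives the plain element sum, and passing `a` itself gives the sum of squares, which is how a single operation can serve for the mean, the squares and the cross products.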
3. The vector processor traverses the reference image A, each time fetching from A two subimages $A_{uv}$ and $A_{(u+4)v}$ whose starting positions are 4 elements apart and whose length is 4p elements; a shuffle operation then yields four adjacent subimages $A_{(u+k)v}$ (k = 0, 1, 2, 3) whose starting positions are successively 1 element apart and whose length is 4p elements;
As shown in Fig. 2, take a PE count of 2 as an example. The processor fetches from the reference image two adjacent vectors p1 and p2 whose starting positions are 4 elements apart. The vector length is 4 times the number of PEs of the vector processor, so p1 and p2 each contain 8 elements. The shuffle operation then produces four adjacent vectors v0, v1, v2 and v3 whose starting positions are 1 element apart.
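The effect of the shuffle in Fig. 2 can be illustrated with list slicing. This is a functional sketch only: the function name is hypothetical, and the actual shuffle instruction and its pattern encoding are hardware-specific and not given in the text. It relies on the fact that p2 starts 4 elements after p1, so the k elements missing from the tail of each shifted vector can be taken from p2.

```python
def shuffle_four(p1, p2):
    """Build v0..v3, each starting one element later than the previous one.

    p1 and p2 are equal-length vectors whose starts are 4 elements apart,
    so element j of p2 sits at the same address as element j + 4 of p1.
    """
    n = len(p1)
    assert len(p2) == n and n >= 4
    # v_k keeps p1 from lane k onward and appends the next k elements,
    # which sit in p2 at lanes n-4 .. n-4+k-1.
    return [p1[k:] + p2[n - 4:n - 4 + k] for k in range(4)]

# Example with 2 PEs (vector length 8) on one row of a reference image.
row = list(range(100, 120))
u = 3
p1 = row[u:u + 8]
p2 = row[u + 4:u + 12]
for k, v in enumerate(shuffle_four(p1, p2)):
    assert v == row[u + k:u + k + 8]
```

Only two aligned loads are issued per group of four subimages, and the three unaligned starting positions are synthesised in registers, which sidesteps the restriction on reads that cross word boundaries.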
4. Using SIMD-based dot-product operations to sum the values within each processing element and a reduction operation to sum the values across processing elements, compute in turn, over all elements of each subimage $A_{(u+k)v}$ (k = 0, 1, 2, 3), the pixel mean, the accumulated pixel sum, the accumulated sum of squared pixel values $(A_{(u+k)v})_{ij}^2$, and the accumulated sum of the pixel-value products $(A_{(u+k)v})_{ij} \cdot B_{ij}$ of the reference image A and the real-time image B;
Here $a_w = ((A_{uv})_{iw}, (A_{uv})_{i(w+1)}, (A_{uv})_{i(w+2)}, (A_{uv})_{i(w+3)})$ is a 32-bit fixed-point vector formed from four 8-bit pixel values, and $e_w = (1, 1, 1, 1)$ is a 32-bit fixed-point vector formed from four unit pixel values. For each of the three accumulated sums, the p processing elements of the vector processor simultaneously compute the corresponding dot product under SIMD control ($a_w$ with $e_w$ for the pixel sum, $a_w$ with itself for the sum of squares, and $a_w$ with $b_w$ for the sum of products), and the results of the p processing elements are then summed by a fixed-point reduction.
5. Compute in turn the mean-residual normalized product correlation coefficients ρ(u, v), ρ(u+1, v), ρ(u+2, v), ρ(u+3, v) of the subimages $A_{(u+k)v}$ (k = 0, 1, 2, 3) with the real-time image B;
Substituting k = 0, 1, 2, 3 in turn into the formula for the mean-residual normalized product correlation coefficient of $A_{(u+k)v}$ with the real-time image B gives ρ(u, v), ρ(u+1, v), ρ(u+2, v), ρ(u+3, v).
Suppose that, from the preceding steps, the pixel mean of the real-time image B is b0 and the accumulated sum of its squared pixel values is b1; the four accumulated pixel sums of the reference image A are s0, s1, s2, s3; the four accumulated sums of squared pixel values are q0, q1, q2, q3; and the four accumulated sums of element products of the reference image A with the real-time image B are r0, r1, r2, r3. The four mean-residual normalized product correlation coefficient values ρ0, ρ1, ρ2, ρ3 are then computed from these quantities, four at a time.
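The formula image for ρ0 to ρ3 is not reproduced in this text. Expanding the definition of the coefficient in terms of the accumulated quantities named above gives the following algebraically equivalent form. It is a derivation from the definition, not a transcription of the missing image, and it assumes that every sum runs over the mn pixels of the real-time image.

$$
\sum_{i,j}\left(A_{ij}-\bar{A}\right)\left(B_{ij}-b_0\right)=r_k-s_k b_0,
\qquad
\sum_{i,j}\left(A_{ij}-\bar{A}\right)^2=q_k-\frac{s_k^2}{mn},
\qquad
\sum_{i,j}\left(B_{ij}-b_0\right)^2=b_1-mn\,b_0^2
$$

$$
\rho_k=\frac{r_k-s_k b_0}{\sqrt{\left(q_k-\dfrac{s_k^{2}}{mn}\right)\left(b_1-mn\,b_0^{2}\right)}},
\qquad k=0,1,2,3
$$

Here $A_{ij}$ is shorthand for $(A_{(u+k)v})_{ij}$ and $\bar{A}=s_k/(mn)$. The factor $b_1-mn\,b_0^2$ depends only on the real-time image, so it can be computed once in step 2 and reused for every subimage.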
6. Set u = u + 4 and repeat steps 3 to 6 until the reference image A has been traversed, which yields all mean-residual normalized product correlation coefficient values of the reference image A with the real-time image B. The coordinates of the best-matching subimage can then be determined by finding the minimum among all the mean-residual normalized product correlation coefficient values.
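Steps 2 to 5 can be put together in a short scalar model that computes four coefficients per pass. This is a sketch under stated assumptions rather than the patented implementation: all function names are hypothetical, images are lists of rows with u as the column offset, the template width n is assumed to be a multiple of 4p, all four subimages are assumed to fit inside the reference image, and the coefficient is evaluated from the accumulated sums by expanding its definition.

```python
import math

def simd_dot_reduce(x, y, p):
    """p PEs each multiply-accumulate 4 lanes; partial sums are reduced across PEs."""
    partial = [sum(a * b for a, b in zip(x[4 * q:4 * q + 4], y[4 * q:4 * q + 4]))
               for q in range(p)]
    return sum(partial)

def shuffle_four(p1, p2):
    """Functional model of the shuffle: four vectors whose starts are 1 element apart."""
    n = len(p1)
    return [p1[k:] + p2[n - 4:n - 4 + k] for k in range(4)]

def template_stats(tmpl, p):
    """Step 2: pixel mean b0 and sum of squares b1 of the real-time image."""
    width = 4 * p
    ones = [1] * width
    total = squares = 0
    for row in tmpl:
        for c in range(0, len(row), width):
            chunk = row[c:c + width]
            total += simd_dot_reduce(chunk, ones, p)
            squares += simd_dot_reduce(chunk, chunk, p)
    return total / (len(tmpl) * len(tmpl[0])), squares

def correlate_four(ref, tmpl, u, v, p, b0, b1):
    """Steps 3 to 5 for the four subimages starting at columns u..u+3 of row v."""
    m, n = len(tmpl), len(tmpl[0])
    width = 4 * p
    ones = [1] * width
    s, q, r = [0] * 4, [0] * 4, [0] * 4
    for i in range(m):
        row = ref[v + i]
        for c in range(0, n, width):
            p1 = row[u + c:u + c + width]          # first load
            p2 = row[u + c + 4:u + c + 4 + width]  # second load, 4 elements later
            b_vec = tmpl[i][c:c + width]
            for k, a_vec in enumerate(shuffle_four(p1, p2)):
                s[k] += simd_dot_reduce(a_vec, ones, p)   # pixel sum
                q[k] += simd_dot_reduce(a_vec, a_vec, p)  # sum of squares
                r[k] += simd_dot_reduce(a_vec, b_vec, p)  # sum of products
    count = m * n
    var_b = b1 - count * b0 * b0
    rhos = []
    for k in range(4):
        den2 = (q[k] - s[k] * s[k] / count) * var_b
        # Assumed convention: 0.0 when the radicand is zero or negative (flat image).
        rhos.append((r[k] - s[k] * b0) / math.sqrt(den2) if den2 > 0 else 0.0)
    return rhos
```

A full traversal would call `template_stats` once and then `correlate_four` for u = 0, 4, 8, ... on every row offset v; how the last, partial group of columns is handled is left open here because the text does not describe it.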
In summary, the present invention efficiently supports the vectorized computation of the mean-residual normalized product correlation coefficient. It brings the parallel computing capability of all PEs of the vector processor fully into play and fully exploits the SIMD-based data parallelism of the vector processor, which effectively improves the execution efficiency of the mean-residual normalized product correlation coefficient on a vector processor.
The above is only a preferred embodiment of the present invention. The scope of protection of the present invention is not limited to the above embodiment; all technical solutions that fall under the concept of the present invention belong to its scope of protection. It should be noted that, for those of ordinary skill in the art, improvements and modifications that do not depart from the principle of the present invention shall also be regarded as falling within the scope of protection of the present invention.
Claims (4)
1. A vector-processor-oriented vectorization method for the mean-residual normalized product correlation coefficient, characterized by comprising the following steps:
(1) Let the reference image be A, of size M×N, and the real-time image be B, of size m×n, with M > m and N > n; the vector processor comprises p processing elements;
(2) The vector processor first traverses the real-time image B and reads its data into vector registers; SIMD-based dot-product operations sum the values within each processing element, and a reduction operation sums the values across processing elements, yielding the pixel mean of the real-time image B and the accumulated sum of its squared pixel values $B_{ij}^2$;
(3) The vector processor traverses the reference image A, each time fetching from A two subimages $A_{uv}$ and $A_{(u+4)v}$ whose starting positions are 4 elements apart and whose length is 4p elements; a shuffle operation then yields four adjacent subimages $A_{(u+k)v}$ (k = 0, 1, 2, 3) whose starting positions are successively 1 element apart and whose length is 4p elements;
(4) Using SIMD-based dot-product operations to sum the values within each processing element and a reduction operation to sum the values across processing elements, compute in turn, over all elements of each subimage $A_{(u+k)v}$ (k = 0, 1, 2, 3), the pixel mean, the accumulated pixel sum, the accumulated sum of squared pixel values $(A_{(u+k)v})_{ij}^2$, and the accumulated sum of the pixel-value products $(A_{(u+k)v})_{ij} \cdot B_{ij}$ of the reference image A and the real-time image B;
(5) Compute in turn the mean-residual normalized product correlation coefficients ρ(u, v), ρ(u+1, v), ρ(u+2, v), ρ(u+3, v) of the subimages $A_{(u+k)v}$ (k = 0, 1, 2, 3) with the real-time image B;
(6) Set u = u + 4 and repeat steps (3) to (6) until the reference image A has been traversed, which yields all mean-residual normalized product correlation coefficient values of the reference image A with the real-time image B.
2. The vector-processor-oriented vectorization method for the mean-residual normalized product correlation coefficient according to claim 1, characterized in that, in step (2), the pixel mean and the accumulated sum of squared pixel values $B_{ij}^2$ are computed with packed dot products, wherein $b_w = (B_{iw}, B_{i(w+1)}, B_{i(w+2)}, B_{i(w+3)})$ is a 32-bit fixed-point vector formed from four 8-bit pixel values and $e_w = (1, 1, 1, 1)$ is a 32-bit fixed-point vector formed from four unit pixel values; the pixel sum is obtained from the dot products of $b_w$ with $e_w$, which the p processing elements of the vector processor compute simultaneously under SIMD control, the results of the p processing elements then being summed by a fixed-point reduction; the accumulated sum of squares is obtained in the same way from the dot products of $b_w$ with itself; and L is the loop count, with L = mn/(4p).
3. The vector-processor-oriented vectorization method for the mean-residual normalized product correlation coefficient according to claim 2, characterized in that, in step (4), the pixel mean, the accumulated pixel sum, the accumulated sum of squared pixel values $(A_{(u+k)v})_{ij}^2$ and the accumulated sum of the pixel-value products $(A_{(u+k)v})_{ij} \cdot B_{ij}$ are computed with packed dot products, wherein $a_w = ((A_{uv})_{iw}, (A_{uv})_{i(w+1)}, (A_{uv})_{i(w+2)}, (A_{uv})_{i(w+3)})$ is a 32-bit fixed-point vector formed from four 8-bit pixel values and $e_w = (1, 1, 1, 1)$ is a 32-bit fixed-point vector formed from four unit pixel values; for each of the three accumulated sums, the p processing elements of the vector processor simultaneously compute the corresponding dot product under SIMD control ($a_w$ with $e_w$ for the pixel sum, $a_w$ with itself for the sum of squares, and $a_w$ with $b_w$ for the sum of products), and the results of the p processing elements are then summed by a fixed-point reduction.
4. The vector-processor-oriented vectorization method for the mean-residual normalized product correlation coefficient according to claim 1, 2 or 3, characterized in that, in step (5), the mean-residual normalized product correlation coefficient of each $A_{(u+k)v}$ (k = 0, 1, 2, 3) with the real-time image B is computed from the pixel means and accumulated sums obtained in steps (2) and (4).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2011102133381A CN102411773B (en) | 2011-07-28 | 2011-07-28 | Vector-processor-oriented mean-residual normalized product correlation vectoring method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102411773A CN102411773A (en) | 2012-04-11 |
CN102411773B true CN102411773B (en) | 2013-03-27 |
Family
ID=45913839
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2011102133381A Active CN102411773B (en) | 2011-07-28 | 2011-07-28 | Vector-processor-oriented mean-residual normalized product correlation vectoring method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102411773B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |