TW200842699A - Signal processing - Google Patents

Info

Publication number
TW200842699A
Authority
TW
Taiwan
Prior art keywords
value
processor
data
function
coefficient
Prior art date
Application number
TW97110966A
Other languages
Chinese (zh)
Inventor
Michael B Montvelishsky
Original Assignee
Technology Properties Ltd
Priority date
Filing date
Publication date
Application filed by Technology Properties Ltd
Publication of TW200842699A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/15 Correlation function computation including computation of convolution operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Complex Calculations (AREA)
  • Image Processing (AREA)

Abstract

A system (1) for calculating a convolution of a data function with a filter function utilizing an array (12) of processors (14) including first and last processors. A coefficient value based on a derivation of the filter function and a data value representative of the data function are multiplied to produce a current intermediate value. Except in the first processor, a prior intermediate value is then added to the current intermediate value. Except in the last processor, the data and current intermediate values are then sent to the next processor. Then the last processor's prior intermediate value, if any, is added to its current intermediate value to produce a result value, wherein the result values collectively are representative of the convolution of the data function with the filter function.

Description

Technical field

The present invention relates to a signal processor and to a signal processing method.

Background

Many existing and emerging systems can be analyzed using modern digital processors that are suitably programmed according to the mathematics describing the underlying system. Such analysis is increasingly useful today for linear time-invariant systems, such as electrical circuits, optical devices, mechanical mechanisms, and many other systems.

In mathematics, and in its many applications across most branches of science and engineering today, the term "transform" refers to a class of equation analysis techniques. The concept of the transform traces back to functional analysis, a branch of mathematics concerned chiefly with the study of function spaces, in which an operator takes a function as its argument. A transform can thus be applied to a single equation or to a whole set of equations, where the process of transformation is a one-to-one mapping of the original equation, expressed in one domain, onto another equation in a different domain.

The motivation for performing a transform is usually straightforward. Many equations are difficult to solve in their original representation but can be solved more easily in one or more other representations. A transform can therefore be performed, a solution found, and an inverse transform then performed to map the solution back into the original domain. The general form of an integral transform is defined as:

$$g(\alpha) = \int_a^b f(t)\, K(\alpha, t)\, dt \tag{1}$$

where $K(\alpha, t)$ is commonly called the kernel of the transform.

The Laplace transform is a subset of the class of transforms defined by equation (1), and it is often particularly useful. Given a simple mathematical or functional description of the input to or output from a system, the Laplace transform provides an alternative functional description that can simplify analysis of the system's behavior. The general form of the Laplace transform is defined as:

$$L\{f(t)\} = \int_0^\infty e^{-st} f(t)\, dt \tag{2}$$
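As a brief worked illustration of equation (2), consider the decaying exponential $f(t) = e^{-ct}$ with a real constant $c$. This function and the symbol $c$ are chosen here purely for illustration and do not appear in the source.

```latex
\begin{aligned}
L\{e^{-ct}\} &= \int_0^\infty e^{-st}\, e^{-ct}\, dt
              = \int_0^\infty e^{-(s+c)t}\, dt \\
             &= \left[ -\frac{e^{-(s+c)t}}{s+c} \right]_0^\infty
              = \frac{1}{s+c}, \qquad s > -c .
\end{aligned}
```

The integral converges only for $s > -c$, a simple instance of a transform that exists only when $s$ is large enough.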

When $a = 0$, $b = \infty$, and $K(\alpha, t)$ is replaced by $e^{-st}$, the integral and its limits in equation (1) are redefined to give equation (2). The Laplace transform can be used only if $s$ is sufficiently large and $f(t)$ satisfies certain conditions, but these conditions are usually flexible enough to allow $f(t)$ to take the functional form of almost any useful function found in practice.

It is often the case that a particular function $F(s)$ is not the transform of a single known function but can be expressed as the product of two functions, each of which is the transform of a known function $f(t)$ or $g(t)$, respectively. That is,

$$F(s) = \hat{f}(s)\, \hat{g}(s) \tag{3}$$

where $g(t)$ must satisfy the same conditions as $f(t)$. From this link between $F(s)$, $f(t)$, and $g(t)$, the following relationship holds:

$$F(s) = L\left\{ \int_0^t f(t - \tau)\, g(\tau)\, d\tau \right\} \tag{4}$$

which is commonly referred to as the convolution theorem.

It can be observed that the convolution theorem results in the transform of an integral of just one variable. Techniques for the numerical approximation of an integral of one variable can therefore be applied. The following equality holds between the integral representation and the Riemann sum representation:

$$\int_0^t f(t - \tau)\, g(\tau)\, d\tau = \lim_{\Delta\tau \to 0} \sum_{k=0} f(c_{t-k})\, g(c_k)\, \Delta\tau \tag{5}$$

where each $c_{t-k}$ and $c_k$ is chosen arbitrarily in the $k$-th subinterval. In practice, the right-hand side of the equality in equation (5) is approximated by using a very small $\Delta\tau$ and recognizing that there is an error term of some order that depends on the numerical technique chosen and on the value of $\Delta\tau$. Thus:

$$\lim_{\Delta\tau \to 0} \sum_{k=0} f(c_{t-k})\, g(c_k)\, \Delta\tau = \sum_{k=0} f(c_{t-k})\, g(c_k)\, \Delta\tau + O(\Delta\tau^m) \tag{6}$$

where $m$ is the order of accuracy that the sum can represent (and also the number of digits of precision that can be expected) and $O$ is the big-O symbol of conventional mathematical notation.

As implied above, there are existing and potential uses in important applications for transforms that can benefit from the use of convolution. For example, one such application uses convolution in connection with digital filtering performed in digital signal processing (DSP).

Any filtering that can be expressed as a mathematical function can be achieved with a digital filter, and this is one of the foundations of modern DSP practice. For example, digital filtering of data values sampled from a signal permits removing unwanted parts of the signal or extracting useful parts of it. Finite impulse response (FIR) and infinite impulse response (IIR) are the two main types of digital filter used in DSP applications today, with FIR filters being the more common.

FIR filters are generally regarded as advantageous because they require no internal feedback, which can, for example, cause an IIR filter to respond to an impulse indefinitely. The word "finite" in the name also points to another advantage of the FIR filter. The impulse response of such a filter ultimately settles to zero, and errors in the iterative summation used do not propagate. That is, the error term stays constant throughout the entire calculation. This is a distinct advantage over an IIR filter, where, for example, the error can grow with each additional iteration of the output sum.
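The Riemann-sum approximation of the convolution integral in equations (5) and (6) can be tried numerically. The following is a minimal sketch, not taken from the source: the function name `riemann_convolution`, the choice of midpoints as the sample points $c_k$, and the test functions $f(t) = e^{-t}$ and $g(\tau) = \tau$ are all assumptions made for illustration.

```python
import math


def riemann_convolution(f, g, t, n):
    """Approximate the integral of f(t - tau) * g(tau) over [0, t] using n subintervals."""
    dtau = t / n
    total = 0.0
    for k in range(n):
        c_k = (k + 0.5) * dtau  # sample point chosen inside the k-th subinterval (midpoint)
        total += f(t - c_k) * g(c_k) * dtau
    return total


def f(t):
    return math.exp(-t)


def g(tau):
    return tau


t = 2.0
# Closed form of the convolution of e^{-t} with tau, used only to measure the error.
exact = t - 1.0 + math.exp(-t)

coarse_error = abs(riemann_convolution(f, g, t, 10) - exact)
fine_error = abs(riemann_convolution(f, g, t, 1000) - exact)
print(coarse_error, fine_error)
```

With the midpoint choice of sample points, the error is expected to shrink roughly with the square of $\Delta\tau$, so the run with 1000 subintervals should be far closer to the closed-form value than the run with 10.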

Unfortunately, for many applications the major limitation of a digital filter is that its speed is restricted by the speed of the processor or processors used for the numerical calculations. If high filtering speeds are required, for example, this can make the hardware needed to implement a digital filter expensive or simply unattainable. For virtually all applications, and as holds generally for most electronic systems, the higher the speed used, the harder it becomes to deal with attendant effects, such as suppressing electromagnetic noise and dissipating heat.

It follows that improving the systems used to perform numerical convolution calculations will allow current and emerging signal processing tasks to be performed at higher speeds, more economically, and with reduced detrimental effects in both the underlying and peripheral systems.

Summary of the invention

An embodiment of the present invention uses a system for convolution calculation that includes multiple computer processors.

A first aspect of the invention provides a system for calculating a convolution of a data function with a filter function. An array of processors is provided, including a first processor and a last processor, each of which includes logic to multiply a coefficient value based on a derivative of the filter function by a data value representing the data function to produce a current intermediate value. In the processors other than the first processor, logic is provided to receive a prior intermediate value, which represents a calculation previously performed in another of the processors, and to add that prior intermediate value to the current intermediate value. In the processors other than the last processor, logic is provided to send the data value and the current intermediate value to another of the processors. Logic is further provided to hold a prior intermediate value from the last processor, if any, as a prior partial value, and to add this prior partial value to the current intermediate value from the last processor to produce a result value. The array of processors thus receives a series of data values and produces a series of result values that collectively represent the convolution of the data function with the filter function.

An embodiment of the first aspect provides a signal processor having means for providing, from a signal to be processed, data values representing the signal, and said system for calculating the convolution of the data representing the signal with a filter function. By way of example, the embodiment is a digital filter.

The first aspect of the invention also provides a method for calculating result values in a convolution of a data function with a filter function. A sequence of coefficient values based on a derivative of the filter function is obtained. For a data value representing the data function, the coefficient values are used in a pipeline of computer processors including a first and a last processor: one of the coefficient values is multiplied by the data value to produce a current intermediate value. Except in the first processor, a prior intermediate value representing a calculation previously performed in another of the processors is added to the current intermediate value. Except in the last processor, the data value and the current intermediate value are sent to a subsequent processor. A prior partial value, if any, is added to the current intermediate value from the last processor to produce a result value, where the prior partial value is a previous intermediate value from the last processor. The result value is output to a digital signal processor that uses this procedure.

An embodiment of the first aspect also provides a method of processing a signal, including providing, from a signal to be processed, data values representing the signal, and said procedure for calculating the convolution of the data values with a filter function.
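The text at this point does not spell out how coefficient values "based on a derivative of the filter function" are formed or combined. The following is a minimal sketch of one possible reading, not necessarily the patented arrangement: it assumes the coefficients are the first differences of the filter coefficients, and that the value saved at the end of the chain acts as a running sum. The function names and the variable `running_sum` are hypothetical.

```python
def direct_convolution(h, x):
    """Reference result: y[n] = sum over k of h[k] * x[n - k], for n < len(x)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def difference_coefficients(h):
    """First differences d[k] = h[k] - h[k-1], treating h as zero outside its range."""
    padded = [0] + list(h) + [0]
    return [padded[k + 1] - padded[k] for k in range(len(h) + 1)]


def pipeline_convolution(h, x):
    """Simulate a chain of processors holding one difference coefficient each."""
    d = difference_coefficients(h)
    held = [0] * len(d)   # data value currently held by each processor
    running_sum = 0       # assumed to play the role of the prior partial value
    results = []
    for sample in x:
        held = [sample] + held[:-1]        # data values move one processor along
        intermediate = 0                   # the first processor has no prior intermediate value
        for coeff, value in zip(d, held):
            intermediate += coeff * value  # multiply, then add to the prior intermediate value
        running_sum += intermediate        # the end of the chain adds its saved value
        results.append(running_sum)
    return results


h = [1, 3, 2]
x = [4, 0, -1, 5, 2]
print(pipeline_convolution(h, x) == direct_convolution(h, x))
```

Under these assumptions the sum of difference coefficients times data values equals the change in the output from one sample to the next, so accumulating it reproduces the ordinary convolution. Difference coefficients of a smooth filter tend to be smaller in magnitude than the coefficients themselves, which would fit values that need fewer bits.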

A second aspect of the invention provides a process, and a corresponding system, for calculating a convolution of a data function with a filter function. A sequence of coefficient values based on a derivative of the filter function is obtained, and a sequence of data values representing the data function is obtained. For each such data value, and with respect to each such coefficient value, the coefficient value and the data value are multiplied in a pipeline of computer processors, including a first and a last processor, to produce a current intermediate value. Except in the first processor, a prior intermediate value representing a calculation previously performed in another of the processors is added to the current intermediate value. Except in the last processor, the data value and the current intermediate value are sent to a subsequent one of the processors. A prior partial value, if any, is added to the current intermediate value from the last processor to produce a result value, where the prior partial value is a previous intermediate value from the last processor. These result values are output to a digital signal processor that uses this process.

An embodiment of the second aspect provides a signal processing method and corresponding system, including providing, from a signal to be processed, data values representing the signal, and said process for calculating the convolution of the data values with a filter function. By way of example, the process is a signal filtering process.

A third aspect of the invention provides an improved system for calculating a convolution, of the type in which at least one processor multiplies a coefficient value representing a filter function by a data value representing a data function. The improvement comprises the coefficient value being based on a derivative of the filter function.

An embodiment of the third aspect provides a signal processor having means for providing, from a signal to be processed, data values representing the signal, and said system for calculating the convolution of the data representing the signal with a filter function. By way of example, the embodiment is a digital filter.

The third aspect of the invention also provides an improved process for calculating a convolution in computer processors, of the type in which coefficient values represent a filter function, data values represent a data function, and the coefficient values and data values are multiplied to produce result values that collectively represent the convolution. The improvement comprises using coefficient values that are based on a derivative of the filter function.

An embodiment of the third aspect provides, from a signal to be processed, data values representing the signal, and uses said process to calculate the convolution of the data values with a filter function. By way of example, the process is a signal filtering process.

The invention also provides a computer program that, when run on an array of computer processors, causes the array to carry out the process of the first, second, or third aspect of the invention. The array of computers may be located on a single semiconductor die. The computer program may be on a carrier, which may be a recording medium in a computer, an electrical signal, or a memory device.

These and other objects and advantages of the present invention will become clear to those skilled in the art in view of the description of the mode of carrying out the invention and the industrial applicability of the preferred embodiment, as described herein and as illustrated in the figures of the drawings.

The objects and advantages of the invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings.

Detailed description

A preferred embodiment of the present invention is a system for convolution calculation performed with multiple computer processors.

As illustrated in the various drawings herein, and particularly in the view of FIG. 1, preferred embodiments of the invention are depicted by the general reference character 10.

Briefly, the present invention is an improved convolution system 10 for numerically approximating the solution to a convolution of a data function with a filter function. Performing convolution calculations using numerical techniques inherently tends to require large numbers of multiplication and addition operations. The present invention permits the overall time needed to perform such calculations to be reduced substantially in two particular ways. First, the invention permits the many necessary calculations to be performed in parallel rather than serially. Second, the invention permits the use of a new class of algorithms that use filter values and data values that can be expressed with fewer data bits and that, in view of the inherent limitations of processors, can therefore be performed more rapidly.

FIG. 1 is a diagram depicting the inventive convolution system 10 being used in an array 12 of computer processors 14. To focus on the convolution system 10 itself, external elements supporting the array 12 are omitted or depicted only generically. Those skilled in the art will appreciate, however, that such elements will be present in actual operating embodiments and that they can generally be conventional. For example, FIG. 1 omits all details related to powering the array 12 and includes generic forms of an external input device 16, an input bus 18, an output bus 20, and an external output device 22. To simplify the presentation, initialization and termination of the general calculation are also not discussed initially, and program instructions and convolution coefficient values are treated as already loaded into the processors 14. The input device 16 is treated here as concerned only with providing the input data values on which the convolution is to be performed, and the output device 22 is treated as concerned only with receiving the output data values that result from the convolution. The input data values are samples of a signal to be processed (e.g. filtered). The output data values are samples of the processed (e.g. filtered) signal.

Starting at the input device 16 and ending at the output device 22, FIG. 1 also shows, in stylized form, a flow path 24. It should be appreciated, however, that arrangements other than this are quite possible. For example, other start and stop locations are possible, paths other than the depicted flow path 24 are possible (and even likely in alternative embodiments), and a single combined I/O device (not shown) may be used instead, for example one having input and output channels communicating with the array 12.

The inventor's presently preferred hardware platform for the convolution system 10 has the array 12 of processors 14 in a single die 26, such as the SEAforth-24A or SEAforth-40A device by IntellaSys Corporation of Cupertino, California. The SEAforth-24A is used in most of the examples herein (and the processors 14 in these examples can properly be termed "cores" or "nodes"). To further facilitate discussion, the members of the overall set of processors 14 are individually referenced as processors 14a-x, as shown, and each has buses 28 that permit it to intercommunicate with the other processors 14 that are present.

Although each processor 14 shown in FIG. 1 has buses 28 interconnecting it to its neighboring processors 14, it can be seen from the route of the flow path 24 that not all of the buses 28 necessarily have to be used. In fact, the embodiment of the convolution system 10 shown here could alternatively be implemented as a one-dimensional array of serially communicating processors (what we term a "pipeline" of processors).

FIG. 2 (prior art) is a diagram of the major internal features of one of the processors 14 in FIG. 1, namely a core of the SEAforth-24A processor. Each processor 14 is generally an independently functioning computer, including an arithmetic logic unit (ALU 30), a quantity of read-only memory (ROM 32), a quantity of random access memory (RAM 34), an instruction decode logic section, a data stack 40, and a return stack 42. Also included are an 18-bit "A" register (A-register 44), a 9-bit "B" register (B-register 46), a 9-bit program counter register (P-register 48), and an 18-bit I/O control and status register (IOCS-register 50). Further included are four communications ports (collectively referred to as ports 52, and individually as ports 52a-d). Except in the case of edge and corner nodes, the ports 52 each connect to a respective bus 28 (and have 18 data lines, a read line, and a write line, not shown individually).

The nodes in the SEAforth-24A device handle communications and processing asynchronously in a particularly rigorous and efficient manner, which makes the device highly suitable for use with embodiments of the inventive convolution system 10. It should be kept in mind, however, that using this particular device, or even hardware approximating its capabilities, is not a requirement. Likewise, it may be necessary to guard against misunderstandings about how data is actually communicated among the processors 14 in the array 12. For example, push and pull metaphors may be used when considering communication between devices, but it should be remembered that communication is actually a cooperative act between devices.

FIGS. 3a-c are partial views of FIG. 1 depicting inbound, outbound, and internal communications using the processors 14. FIG. 3a shows how data passes between the input device 16 and processor 14a and between processor 14a and processor 14b. FIG. 3b shows how data passes between processor 14w and processor 14x and between processor 14x and the output device 22. FIG. 3c shows how data passes between processor 14i and processor 14j.

Each of the processors 14 in FIGS. 3a-c is represented as having generic key information holding elements. The SEAforth-24A device features RAM, ROM, registers, and ports, all of which can be used programmatically to perform calculations. The generic information holding elements discussed here can therefore be any of RAM, ROM, registers, and ports. In the case of processor 14a, a signal data element 60 is the key information holding element. In the case of processors 14b-w, a signal data element 60, an integral kernel filter element 62, and a calculated element 64 are each key information holding elements. And in the case of processor 14x, a result element 66 is the key information holding element.

Turning now to FIG. 3a, this shows how data can enter the array 12. In the embodiment of the convolution system 10 illustrated here, processor 14a is dedicated to receiving data from the input device 16 and providing it to processor 14b. Processor 14a can thus receive and store data words from the input device 16 and then use its signal data element 60 to provide instances of these data words to processor 14b, limited only by the capacity of its RAM 34 and by whether it has been suitably programmed.

Communication and processing in the SEAforth-24A device are both asynchronous, so once processor 14a makes data available to processor 14b, processing of the task at hand conceptually "flows" through the rest of the array 12.

FIG. 3b shows how data can be extracted from the array 12. Processor 14x is dedicated here to receiving data from processor 14w and providing it to the output device 22. Processor 14x thus receives and stores data words from processor 14w and then uses its result element 66 to provide data words to the output device 22, again limited only by the capacity of its RAM 34 and by whether it has been suitably programmed.

FIG. 3c shows how the contents of the signal data element 60 and the calculated element 64 generally flow between the processors 14b-w, and also how sums can be accumulated, stored in each processor 14, and then passed along in the course of a convolution calculation. As detailed presently, each of the processors 14b-w can perform an operation here that contributes to the overall calculation. In the case of processor 14b, this operation uses a new input data value (in its signal data element 60) and a pre-stored convolution coefficient value (in its integral kernel filter element 62). In this particular example, processor 14b does not need a calculated element 64, since nothing is yet "partial" from an earlier calculation stage. For programming simplicity, however, processor 14b may have a calculated element 64 loaded with zero. Likewise, for applications in which multiple convolution coefficients are handled per node (discussed presently), processor 14b may have and use a calculated element 64.

Next, in the case of processors 14c-w, each performs an operation that contributes to the overall convolution calculation using a pre-stored convolution coefficient value, an input data value from its respective preceding processor 14 along the flow path 24, and an intermediate value also from that preceding processor 14. The convolution coefficient values are held in the respective integral kernel filter elements 62, the input data values are held temporarily in the respective signal data elements 60, and the intermediate values are held temporarily in the respective calculated elements 64.

FIGS. 2 and 3a-c can be used to see more generally how the ports and registers of the processors 14a-x in the SEAforth-24A device can be used as just described. For example, processor 14a uses its port 52d to pass an input data value rightward to processor 14b, which can put it onto its data stack 40.

The processor 14b then performs an operation contributing to the convolution, using the input data value now on its data stack 40 and a convolution coefficient value already on its data stack 40, and the processor 14b then places the resulting intermediate data value on its port 52d.

Similar operations can then take place along the flow path 24 in the processors 14b-w. Although the nodes in the SEAforth-24A device operate asynchronously, the operations in the processors 14b-w can all conceptually be viewed here as occurring in parallel. Thus, substantially at the same time as what was just described for processor 14b, similar operations can take place in processors 14i and 14j, for example, except that these will use their own respective convolution coefficient values, process intermediate data values, and use their own ports 52 along the flow path 24. Also substantially at the same time, the processor 14w will make an output data value available at its port 52c for the processor 14x to handle as described above. It should again be noted, however, that RAM, ROM, registers, and ports can all be used programmatically in the SEAforth-24A device, and that the preceding example is merely one of many ways in which the processors 14 can be programmed to achieve the same result.

Figures 4a-f are block diagrams outlining the stages of a convolution calculation in an array 12 of processors 14 such as that of Figure 1. In general, the stages here entail:
(1) multiplying the data sample values and the convolution coefficient values in parallel;
(2) calculating the sum of the products from stage (1);
(3) shifting the data sample values through the array 12 (i.e., through the pipeline), receiving the next data sample value into the first node and discarding the data sample value from the last node; and
(4) repeating as needed (i.e., as now described in more detail).
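The four stages listed above can be sketched at the level of bins rather than processors. The following is a minimal illustration only, not the listing of the patent: the function name `direct_convolution` and the list-based bins are hypothetical, and an ordinary sequential Python loop stands in for work that the array 12 would perform in parallel.

```python
def direct_convolution(coeffs, samples):
    """Bin-level model of stages (1)-(4): multiply, sum, shift, repeat.

    coeffs  -- direct convolution coefficient values (the "c bins")
    samples -- data sample values fed into the pipeline one at a time
    """
    n_bins = len(coeffs)
    d_bins = [0] * n_bins            # the data bins start out holding zeros
    results = []
    # After the real samples, zeros are fed in until the last sample
    # has passed through every bin (assumed padding, as in Figures 4e-f).
    stream = list(samples) + [0] * (n_bins - 1)
    for new_sample in stream:
        # Stage (3): shift the data bins, taking the new sample into bin 0
        # and discarding whatever falls out of the last bin.
        d_bins = [new_sample] + d_bins[:-1]
        # Stage (1): multiply each data bin by its coefficient
        # (conceptually in parallel, one product per node).
        products = [c * d for c, d in zip(coeffs, d_bins)]
        # Stage (2): sum the products to obtain one result value.
        results.append(sum(products))
        # Stage (4): repeat for the next sample.
    return results
```

With three coefficients `[1, 2, 3]` and three samples `[4, 5, 6]`, the sketch returns the five values `[4, 13, 28, 27, 18]`, which is the 2n+1 result count for n = 2.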

Because the SEAforth-24A device features RAM, ROM, registers, and ports, all of which can be used programmatically to perform calculations, and because the inventive convolution system 10 can be used with other devices having lesser, greater, or other capabilities and structures, the data storage elements in Figures 4a-f are referred to here generically as "bins." To simplify the following discussion, the example here uses the same number (specifically, 22 each) of convolution coefficient values, sample data values, and processors 14 used in the actual convolution calculation. Such equalities may be rare in many "real world" applications, so considerations for alternative cases are discussed presently; in any case, once the following is fully appreciated, using other equalities should be straightforward for those skilled in the art.

Figure 4a shows the stage at which formal calculation is ready to begin. The convolution coefficient values (c_0 ... c_n; n+1 values in total, where n = 21 in the SEAforth-24A device of Figure 1 and Figures 3a-c) have been loaded into bins (collectively c bins 72, specifically c bins 72(0...n)), zeros have been loaded into other bins (collectively d bins 74, specifically d bins 74(0...n)), and still other bins (collectively r bins 76, specifically r bins 76(0...2n)) have initially inconsequential contents. In Figures 4a-f and the following discussion, indices start at zero, "c" stands for "coefficient," "d" stands for "data," "a" stands for an "accumulated" intermediate value, and "r" stands for "result."

Figure 4b shows the next stage, at which a first data sample value (d_0) has been received into d bin 74(0). Calculation proceeds as shown, substantially simultaneously and in parallel along the entire length of the pipeline, with a first result value (r_0) being stored in r bin 76(0).

Figure 4c shows the next stage, at which the prior data sample value (d_0) has been moved to d bin
74(1) and a second data sample value (d_1) has been received into d bin 74(0). Calculation again proceeds as shown, substantially simultaneously and in parallel along the entire length of the pipeline, with a second result value (r_1) now being stored in r bin 76(1).

Between Figure 4c and Figure 4d are n-2 stages that are conceptually much like the stage just described.

Figure 4d shows the stage at which the last data sample value (d_n) has been received into d bin 74(0). Calculation again proceeds as shown, with a result value (r_n) now being stored in r bin 76(n).

Figure 4e shows the next stage. All n+1 of the data sample values (d_0 ... d_n) have now been at least partially processed; the last data sample value (d_n) has been moved into d bin 74(1) and a value of zero has been put into d bin 74(0). Calculation continues, and a result value (r_(n+1)) is stored in r bin 76(n+1).

Between Figure 4e and Figure 4f are a further n-2 stages that are conceptually much like the stage just described.

Figure 4f shows the stage at which processing of the last data sample value (d_n) finally finishes. After the calculation here, the (2n+1)th result value (r_(2n)) has been stored in r bin 76(2n) and processing is complete. The r bins 76(0...2n) now hold the complete result of the convolution calculation performed here from the n+1 data sample values (d_0 ... d_n) and the n+1 convolution coefficient values (c_0 ... c_n).

Figures 5a-f are block diagrams outlining the stages of a convolution calculation according to a new algorithm, again taking place in an array 12 of processors 14 such as that of Figure 1. Briefly, the new algorithm uses the derivative of the filter function. To emphasize this, the convolution coefficient values used here are denoted differently, as c'_0 ... c'_m (where m = 21 in the SEAforth-24A device of Figure 1 and Figures 3a-c; the reason for using a different index reference here is discussed presently).

Figure 5a shows the stage at which formal calculation is ready to begin. The derivative convolution coefficient values (c'_0 ... c'_m) have been loaded into bins (collectively c' bins 82, specifically c' bins 82(0...m)), zeros have been loaded into other bins (collectively d bins 84, specifically d bins 84(0...m)), and a single p bin 86 and a set of result bins (collectively r bins 88, specifically r bins 88(0...2m)) have initially inconsequential contents. Somewhat like the prior example, indices start at zero, "c'" stands for the derivative of a "coefficient," "d" again stands for "data," "a" again stands for an "accumulated" intermediate value, "p" stands for "partial" (as in a "part" contributing to the result), and "r" stands for "result."

Figure 5b shows the next stage, at which a first data sample value (d_0) has been received into d bin 84(0). Calculation proceeds as shown, substantially simultaneously and in parallel along the entire length of the pipeline, with a first partial value (p_0) being provided to the p bin 86. Unlike variations using the conventional convolution algorithm shown in Figures 4a-f, however, the prior partial value is here added to the current partial value, and the sum is then stored in r bin 88(0). Because there is nothing "prior" at this early stage, zero is added to the first partial value (p_0) to calculate the first result value (r_0) stored in r bin 88(0).

Figure 5c shows the next stage, at which the prior data sample value (d_0) has been moved to d bin 84(1) and a second data sample value (d_1) has been received into d bin 84(0). Calculation again proceeds as shown, substantially simultaneously and in parallel along the entire length of the pipeline, with a second partial value (p_1) now being provided to the p bin 86. The prior partial value (p_0) is added to the current partial value (p_1), and this is then stored
as a second result value (r_1) in r bin 88(1).

Between Figure 5c and Figure 5d are m-2 stages that are conceptually much like the stage just described.

Figure 5d shows the stage at which the last data sample value (d_m) has been received into d bin 84(0). Calculation again proceeds as shown, with an (m+1)th result value (r_m) now being stored in r bin 88(m).

Figure 5e shows the next stage. All m+1 of the data sample values (d_0 ... d_m) have now been at least partially processed; the last data sample value (d_m) has been moved into d bin 84(1) and a value of zero has been put into d bin 84(0). Calculation continues, and a result value (r_(m+1)) is stored in r bin 88(m+1).

Between Figure 5e and Figure 5f are a further m-2 stages that are conceptually much like the stage just described.

Figure 5f shows the stage at which processing of the last data sample value (d_m) finally finishes. After the calculation here, the (2m+1)th result value (r_(2m)) has been stored in r bin 88(2m) and processing is complete. The r bins 88(0...2m) now hold the complete result of the convolution calculation performed here from the m+1 data sample values (d_0 ... d_m) and the m+1 derivative convolution coefficient values (c'_0 ... c'_m).

In summary, it should now be clear from the above that the inventive convolution system 10 permits a substantial portion of the necessary calculation to be performed in parallel rather than serially. In the simplified examples just described, for instance, 22 processors 14 perform calculations in parallel. Note that all 24 of the processors 14 in the SEAforth-24A device could also be used, but the processor 14a and the processor 14x would then have to serve a dual function, handling both calculation and I/O.
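The stages of Figures 5a-f can be sketched in the same bin-level style. This is a hedged illustration under stated assumptions: the derivative coefficients are taken to be first differences of the direct coefficients, with zero assumed on either side of the filter, and the result is kept as a running sum of the partial values (equivalently, the previous result plus the current partial value). The names `derivative_coefficients` and `derivative_convolution` are hypothetical, and the sequential loop again stands in for parallel hardware.

```python
def derivative_coefficients(coeffs):
    """Return c'[k] = c[k] - c[k-1], treating values outside the filter as zero.

    n+1 direct coefficients yield n+2 derivative coefficients.
    """
    padded = [0] + list(coeffs) + [0]
    return [padded[k + 1] - padded[k] for k in range(len(coeffs) + 1)]


def derivative_convolution(coeffs, samples):
    """Bin-level model of Figures 5a-f using derivative coefficients."""
    d_coeffs = derivative_coefficients(coeffs)   # the "c' bins"
    d_bins = [0] * len(d_coeffs)                 # the "d bins", initially zero
    results = []
    previous = 0                                 # nothing "prior" at the first stage
    # Feed zeros after the real samples so the result count matches
    # what the direct approach produces (assumed padding).
    stream = list(samples) + [0] * (len(coeffs) - 1)
    for new_sample in stream:
        # Shift the data bins and take in the new sample.
        d_bins = [new_sample] + d_bins[:-1]
        # Partial value: sum of products with the derivative coefficients.
        partial = sum(c * d for c, d in zip(d_coeffs, d_bins))
        # Result value: prior value plus the current partial value.
        previous = previous + partial
        results.append(previous)
    return results
```

For coefficients `[1, 2, 3]` the derivative coefficients are `[1, 1, 1, -3]`, and with samples `[4, 5, 6]` the sketch returns `[4, 13, 28, 27, 18]`, the same values a direct convolution of those inputs gives.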

As noted at the beginning of this section, the inventive convolution system 10 also permits the use of a new class of algorithms, the features of which are now discussed. Figures 6a-c are graphs depicting convolutions performed using the two approaches presented so far; these graphs provide a conceptual overview of what appears in Figures 4a-f and Figures 5a-f.

Figure 6a shows a first trace 92 representing the use of conventional convolution coefficients and a second trace 92' representing the use of derivative convolution coefficients, that is, ones that can be used by the new class of algorithms and with the present invention. Figure 6b shows a single trace 94 of the input data on which the two approaches presented so far operate (another trace 94' here is discussed presently). Figure 6c shows a single trace 96 representing the result of both of the approaches discussed so far. In the particular example shown in Figures 6a-c, trace 92 is represented by the following equation:

(7) $u(t) = 10\cos^{2}(\ldots)$

Trace 92' is represented by u'(t). Trace 94 is represented by the following equation:

(8) $v(t) = 0.8\cos(\ldots)$

and trace 94' is represented by v'(t). Here, t is defined on the interval from … to 1 with a step size of 0.01, the index runs from 0 to c with a step size of 1, and c is the number of data points spanned by the filter function (2… in this example). Overall, Figures 6a-c show how very much the same result, shown here as trace 96, can be achieved using either of the approaches discussed so far.

Other approaches using the new class of algorithms are also possible. For example, referring again to Figures 6a-c, a derivative data function, such as that represented by trace 94', and a conventional filter function, such as that represented by trace 92, can be used, and the result in trace 96 will again be the same. The literature suggests that others have considered this
approach, although not as implemented in the new manner disclosed herein. The use of function derivatives can also be extended further as a matter of logic. For example, using derivatives of both the data and filter functions is theoretically possible. Further still, using higher-order derivatives of either or both of the data and filter functions is also theoretically possible. As a practical matter, these approaches may have limited real-world utility, but they are nonetheless encompassed by the spirit of the present invention. In Figures 6a-c, the derivatives of the data and filter functions are represented by traces 94' and 92', with the result still being that in trace 96.

The effort required to obtain a function derivative leads to some important considerations when using derivatives in any approach. In a convolution calculation, the number of data values used will usually exceed the number of coefficient values used. The effort needed to obtain the derivative of the filter function is therefore usually less than that needed to obtain the derivative of the data function. There may be special cases in which no particular effort is needed to obtain the derivative of the data function, but these are likely to be quite rare. Furthermore, because the filter coefficient values are usually the same when multiple convolution calculations are performed on typically different data sample values, the effort of obtaining the filter coefficient values can be reused many times; that is, the effort is "amortized." More precisely, in many applications this effort is expended at design time and the filter values can be entered as program constants (for example, even stored in ROM 32 in a device like the SEAforth-24A).

Referring still to Figures 6a-c, it can be seen that trace 92' and trace 94' have markedly smaller amplitude ranges, whereas trace 92 and trace 94 have large amplitude ranges. As a practical matter, this means that the values in trace 92' and trace 94' can be represented using fewer bits. The importance of this can be appreciated when considering the nature of the available tools, namely the inherent limitations of digital processors. Although
the SEAforth-24A device is in fact quite capable among the many suitable devices, we continue to use it here to illustrate how the inventive convolution system 10 can generally help overcome certain limitations inherent in modern digital processors.

For example, the convolution filter values in trace 92 may have to be represented using 18-bit values, whereas the corresponding values in trace 92' used by the present invention can be represented using values of 9 bits or fewer. Similarly, the data sample values in trace 94 may have to be represented using 18-bit values, whereas the corresponding values in trace 94' used by the present invention can be represented using values of 9 bits or fewer. The inventor has observed that using all 9-bit values increases the speed of the inventive convolution system 10 by about a factor of four (4x).

Digital processors are inherently limited in the size of the values on which they can directly operate. For example, the 4004 processor introduced by Intel Corporation in 1971 handled 4-bit values, while the processors in present-day personal computers directly handle 32-bit or 64-bit values. In addition, multiplication of large values is usually one of the slowest operations that can be performed in most processors today. The inventor has observed that multiplication can account for 60-90% of the execution time of a convolution algorithm.

The SEAforth-24A device is no exception to these general principles of digital processors. It uses the Forth language and relies on numeric values represented as 18 bits unsigned (or 17 bits signed) or as 9 bits unsigned (or 8 bits signed). If a value needs 10 bits, for example, it must therefore be handled effectively as if it needed 18 bits. Referring again to Figure 2, it can again be seen that each processor 14 in the SEAforth-24A device has an 18-bit A register 44, a 9-bit B register
46, 18-bit-wide words in ROM 32 and RAM 34, and 18-bit-wide ports 52.

Here, the equivalent of multiplying two 18-bit values in the Forth language in a processor 14 requires the following sequence of thirty-six opcodes (where "." represents a NOP or no-operation instruction and "+*" represents a plus-star or bitwise multiply instruction):

(9)

Conversely, the equivalent of multiplying an 18-bit value by a 9-bit value in the Forth language in a processor 14 requires the following sequence of eighteen opcodes:

(10)

The equivalent of multiplying two 9-bit values in the Forth language in a processor 14 requires the following sequence of nine opcodes:

(11)

Clearly, from the standpoint of computational burden and achievable speed, calculating with equation (9) is the least favorable task and calculating with equation (11) is the most favorable. These can be said to proceed at speeds of 1x, 2x, and 4x, respectively.

A more rigorous "proof" of the conceptual basis above is now presented. The new class of algorithms usable by the inventive convolution system 10 uses a derivative representation in place of a direct representation of the integral kernel (see generally the earlier equations and the discussion of the prior art). The following is essentially a restatement of equation (5):

(12) $r(t) = \int_{0}^{t} f(t-\tau)\,g(\tau)\,d\tau = \lim_{\Delta\tau \to 0} \sum_{k=0}^{t} f(c_{t-k})\,g(c_{k})\,\Delta\tau$

where f(t-τ) represents the integral kernel. Suppose, however, that the integral kernel is instead represented by f'(t-τ). This leads to:

(13) $r'(t) = \int_{0}^{t} f'(t-\tau)\,g(\tau)\,d\tau = \lim_{\Delta\tau \to 0} \sum_{k=0}^{t} f'(c_{t-k})\,g(c_{k})\,\Delta\tau$

Now assume that the integral kernel is, or can be represented by, a particular low-pass filter. The following approximation is then reasonable:

(14) $f'(c_{t-k}) = f(c_{t-k}) - f(c_{t-k-\Delta\tau})$

and therefore:

(15) $\lim_{\Delta\tau \to 0} \sum_{k=0}^{t} \left[ f(c_{t-k}) - f(c_{t-k-\Delta\tau}) \right] g(c_{k})\,\Delta\tau = \lim_{\Delta\tau \to 0} \sum_{k=0}^{t} f(c_{t-k})\,g(c_{k})\,\Delta\tau - \lim_{\Delta\tau \to 0} \sum_{k=0}^{t} f(c_{t-k-\Delta\tau})\,g(c_{k})\,\Delta\tau$

Here it is easy to see that the first summation on the right-hand side of equation (15) is r(t). What is less obvious is that the second term on the right-hand side is exactly the previous convolution value from exactly one time step Δt earlier, which can be expressed as:

(16) $\lim_{\Delta\tau \to 0} \sum_{k=0}^{t} f(c_{t-k-\Delta\tau})\,g(c_{k})\,\Delta\tau = r(t-\Delta t)$

Thus the following relationship holds between a convolution using the direct integral kernel representation and a convolution using the derivative of the direct integral kernel representation:

(17) $r'(t) = r(t) - r(t-\Delta t) \;\Rightarrow\; r(t) = r'(t) + r(t-\Delta t)$

An important observation here is that the convolution using the direct representation is exactly equal to the sum of the convolution using the derivative representation and the previously calculated convolution. That is, it is the same as the derivative-based sum plus whatever has already been calculated.

Applying this in the context of the new class of algorithms usable in the inventive convolution system 10, the direct filter coefficient values (c_0 ... c_n) can be used to
calculate the derivative coefficient values (c'_0 ... c'_(n+1)) in the following manner:

$c'_{0} = c_{0} - 0;\quad c'_{1} = c_{1} - c_{0};\quad c'_{2} = c_{2} - c_{1};\quad \ldots;\quad c'_{n} = c_{n} - c_{n-1};\quad c'_{n+1} = 0 - c_{n}$

If the value of c_0 is non-zero, we simply get c'_0 = c_0 - 0 = c_0. Similarly, if c_0 is zero, we simply get c'_0 = 0 - 0 = 0, and there is no reason to associate a derivative filter value with a node for a value of zero, because it contributes nothing to the accumulated intermediate value (a_0). This reasoning can be extended to any direct filter value, or any derivative of a direct filter value, that is zero: a filter coefficient value need not be associated with a particular node if it would contribute nothing to the accumulated intermediate value calculated at that node. Except for the first node, however, the partial sum at each node must still be accounted for.

Returning now to why the index n is used in Figures 4a-f and the index m is used in Figures 5a-f, this was done to emphasize a key difference between using the direct algorithm and using the derivative algorithm. As can be seen in the preceding paragraph, where n+1 direct filter coefficient values (c_0 ... c_n) can be used, n+2 derivative coefficient values (c'_0 ... c'_(n+1)) are needed. Thus, for example, if one applies 21 processors to an application using the direct algorithm, 22 processors will be needed for the same application using the derivative algorithm. Or, if one applies 22 processors to an application using the direct algorithm, 23 processors will be needed for the same application using the derivative algorithm. In the case of the SEAforth-24A device of Figure 1 and Figures 3a-c, this would require either leaving a processor unused or combining the I/O functions into a single processor, so that 23 processors can be used by the derivative algorithm. This may pose a minor problem when using a 24-processor device (e.g., the SEAforth-24A), but it is less
of a problem when using a 40-processor device (e.g., a SEAforth-40A), and it quickly becomes trivial when using devices with 80, 96, 128, or more processors.

Note further that the discussion here has been simplified by using the same number of convolution coefficient values and sample data values. If there are 22 sample data values to be processed when using the direct algorithm, then 23 sample data values will need to be processed when using the derivative algorithm. Of course, 22 actual data values can be used, "padded" with a 23rd value of zero. More commonly, however, this will not be an issue, because a large, potentially near-infinite number of sample data values will be used in most real-life applications.

In brief, the benefit of using a derivative representation in place of a direct representation is that the values needed to represent the amplitude of the derivative are usually much smaller than the values needed to represent the direct filter. Continuing with the SEAforth-24A device as an example, instead of needing a full 18-bit data word to represent each amplitude value, a 9-bit data word is usually sufficient. There are two requirements for a 9-bit representation to be considered sufficient. The first is that the coefficient values be expressible as 9-bit unsigned values (or 8-bit signed values). The second is that the differences between consecutive direct filter coefficients be expressible as 9-bit unsigned values (or 8-bit signed values). In the case of a typical low-pass filter, such as that used in the example of Figures 6a-c, the derivative values will have to be represented in just 8 bits because the most significant bit is reserved as a sign bit. As a general rule, if the values in question are unsigned or can be treated as unsigned, the derivative approach is an appropriate method if the differences between consecutive direct coefficients are less than 512 units. Otherwise, the differences between consecutive direct coefficient values must be less than 256 units.

Figure 7 is a code listing of an exemplary direct filter 700. The programming language used here is Forth, and the target hardware is a processor 14 of the SEAforth-24A device.

Item 702 is a compiler directive equating "IO" to the IOCS register 50. This specifies where data is read from and written to. Note that the IOCS register 50 in the SEAforth-24A device can simultaneously specify different ports 52 for reading and for writing. To avoid confusion, those familiar with less sophisticated devices should keep this in mind when reading the following.

Item 704 is a compiler directive equating "H" to a coefficient value. The particular value in the listing is merely one used as an example. It is what will be in the c bin 72 of this processor 14.

Item 706 is a comment in the Forth language.

Item 708 is a location label in the Forth language.

Item 710 is a sequence of Forth instructions that initialize the processor 14 for the convolution calculation to follow. Specifically, IO is loaded onto the top of the data stack; it is then popped from there into the B register 46 so that the B register points to the IOCS register 50; H is then loaded onto the top of the data stack; and a NOP instruction pads out the 18-bit instruction word used here to contain this instruction sequence.

Item 712 specifies the beginning of a loop in which one of three cases is conditionally compiled. Which one depends on whether the processor 14 is being programmed as the first (processor 14b), a middle (any of processors 14c-v), or the last (processor 14w) in the pipeline. See also Figures 3a-c.

Item 714 here specifies the start of the compilation of instructions for the most typical case,
3019-9547-PF 29 200842699 點,即主處理器14係處理考 处理态14c-v中之一者的情況。 [注思’右相關的話,阁 一 圖7-8中的指令字右側的註解使 用表示資料堆疊及迴返堆疊 p且旧一仃条構,其中,向右的元 素係在個別的堆疊中之早^ 取上方。在下面說明中的附加說明 顯示這轉變成一前-後及南士 ― 便及向左的7〇素係最上方的架構,其係 在許多Forth原文書中遇至,丨的加 的木構。以此方式,二架構被 提供以使本例更容易瞭解。] 項目716係另-序列的Forth指令。具體而言,一資 料樣本值(s)係從B暫存器46指向處被讀取並且被推至資 料堆疊(h--sh)上;然後,一累積值(a)也被讀取並被推至 資料堆疊(sh—ash)上;然後,資料堆疊上之上方元素被取 出並且被推至迴返堆疊(1) ·· ash —— shR ·· —— a)的上部上;然 後,在為料堆豐上的上部元素被複製及推至資料堆疊⑺·· sh--ssh R : a —a)上。 接著是項目718,其中,資料堆疊上的上方元素被取 • 出並且被推至迴返堆疊(D ·· ssh--sh R : a--sa)的上部上; 然後,執行一大的相乘(“MULT” ,定義被提供在 SEAforth-24A裝置的BI0S中)。資料堆疊的上兩個元素在 此被用以做為乘數及被乘數,且當使被乘數就像資料堆疊 (D ·· sh —a’ h R : sa--sa)中的第二元素一樣時,上方元素 被結果(a ’ )取代。 接著是項目720,其中,在迴返堆疊上的上方元素被 取出並且被推至資料堆疊(D:a,h--sa,hR:sa —一3)上; 資料堆疊的上方元素被取出並被寫入至B暫存器46指向3019-9547-PF 29 200842699 Point, that is, the case where the main processor 14 processes one of the processing states 14c-v. [Notes on the right side, the annotation on the right side of the instruction word in Figure 7-8 shows the data stacking and returning stack p and the old one, where the elements to the right are in the early stack. ^ Take the top. The additional explanation in the following description shows that this is transformed into a top-back and south-south-left and left-left structure of the top, which is encountered in many Forth original texts. In this way, the second architecture is provided to make this example easier to understand. ] Item 716 is a separate-sequence Forth instruction. Specifically, a data sample value (s) is read from the B register 46 and pushed onto the data stack (h--sh); then, a cumulative value (a) is also read and Pushed onto the data stack (sh-ash); then, the upper element on the data stack is taken out and pushed back to the stack (1) ·· ash — shR ·· —— a) on the upper part; then, in The upper element on the material pile is copied and pushed onto the data stack (7)·· sh--ssh R : a — a). 
Next is item 718, in which the upper element on the data stack is taken out and pushed onto the upper portion of the return stack (D ·· ssh--sh R : a--sa); then, a large multiplication is performed ("MULT", the definition is provided in the BIOS of the SEAforth-24A device). The last two elements of the data stack are used here as multipliers and multiplicands, and when the multiplicand is made like the data stack (D ·· sh —a' h R : sa--sa) When the two elements are the same, the upper element is replaced by the result (a ' ). Next is item 720, in which the upper element on the return stack is taken and pushed onto the data stack (D: a, h--sa, hR: sa - one 3); the upper element of the data stack is taken out and Write to B register 46 pointing

3019-9547-PF 30 200842699 〇>:Sa’ hi’ hR:a —3)處;迴返堆疊的上方元素被取 出並士被推至資料堆疊(D:a,卜aa,…卜)上·且一 非運异指令填補被使用的18位元指令字。 接著是項目722’其中’資料堆7疊的上兩個元素被加 在一起’上方元素被新的累積和(a,,)取代且第二元辛被 下-㈣低的元素(D:aa,h — a,,卜r:—)取代。、3019-9547-PF 30 200842699 〇>:Sa' hi' hR:a —3); the upper element of the return stack is taken out and pushed onto the data stack (D:a, aa,...b) And a non-transfer instruction fills the 18-bit instruction word used. Next is the item 722' where the 'top two elements of the data stack 7 are added together' the upper element is replaced by the new accumulation and (a,,) and the second element is sub- (four) low (D:aa , h — a,, 卜 r: —). ,
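The stack traffic of items 716 to 722 can be followed with two explicit stacks. The following is a minimal Python sketch, not the Forth code of FIG. 7. The function name middle_stage_step and the read_port and write_port callables are hypothetical stand-ins for port access through the B register 46, and an ordinary integer product stands in for MULT. Each list holds the top of its stack at the right-hand end, as in the figure comments.

```python
def middle_stage_step(h, read_port, write_port):
    """One loop pass of a middle processor, tracked on two explicit stacks.

    Hypothetical helper: read_port/write_port model the neighbour ports,
    and s * h models the MULT definition. Top of each stack is at the END.
    """
    data, ret = [h], []           # D: h          R: (empty)
    data.append(read_port())      # read sample s            D: h s
    data.append(read_port())      # read accumulation a      D: h s a
    ret.append(data.pop())        # park a on return stack   D: h s    R: a
    data.append(data[-1])         # duplicate s              D: h s s
    ret.append(data.pop())        # park the copy of s       D: h s    R: a s
    s = data.pop()                # multiply: s is replaced by a' = s * h,
    data.append(s * data[-1])     # h stays underneath       D: h a'
    data.append(ret.pop())        # fetch s back             D: h a' s R: a
    write_port(data.pop())        # forward s downstream     D: h a'
    data.append(ret.pop())        # fetch a back             D: h a' a R: (empty)
    a = data.pop()                # add the top two elements
    data.append(data.pop() + a)   # a'' = a' + a             D: h a''
    return data, ret


incoming = iter([4, 10])          # upstream sends sample 4, then sum 10
forwarded = []
print(middle_stage_step(3, lambda: next(incoming), forwarded.append), forwarded)
# prints ([3, 22], []) [4]
```

The coefficient stays at the bottom of the data stack for the whole pass, which is why the loop can repeat without reloading it.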

Next is item 724, which ends the conditional compilation of code for the case where the subject processor 14 is one of the processors 14c-v.

Understanding the other two conditional compilations shown in FIG. 7 should now be straightforward. The case for processor 14b is simpler because there is no prior accumulation value to be read and added. The case for processor 14w is also somewhat simpler because the present data sample value does not need to be written to a "subsequent" processor.

Item 726 is the instruction sequence compiled for all of the processors 14b-w, in which the top element of the data stack is popped and written to where the B register 46 points (D: a'' h -- h R: --); the loop then returns to item 712.

FIGS. 8a-b are a code listing of an exemplary derivative filter 822, wherein FIG. 8a shows code that is conceptually similar in function to that of FIG. 7 (except for 9-bit rather than 18-bit calculation) and FIG. 8b shows the additional code used by the derivative-based algorithm.

As can be seen in FIG. 8a, much of the code here is essentially the same as in the direct filter 700 discussed above. One exception, however, is item 802, where nine plus-star ("+*") operations are used to perform a small multiply (instead of the large multiply performed using the MULT definition).
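Taken as a whole, a chain of processors running the FIG. 7 code behaves as a multiply-accumulate pipeline with one coefficient per stage. The Python sketch below shows only that net arithmetic. It rests on a stated assumption that the text does not spell out: while the chain is building result n, stage k is working on sample n - k. Port timing and blocking are not modeled, and pipeline_convolve is a hypothetical name.

```python
def pipeline_convolve(samples, coeffs):
    """Net effect of a chain of stages, one coefficient per stage.

    ASSUMPTION (not from the source): stage k holds sample n - k while
    result n is being accumulated. Port timing is not modeled.
    """
    results = []
    for n in range(len(samples) + len(coeffs) - 1):
        acc = 0                                   # first stage: no prior sum
        for k, h in enumerate(coeffs):            # walk down the chain
            in_range = 0 <= n - k < len(samples)
            s = samples[n - k] if in_range else 0
            acc = acc + s * h                     # prior + current intermediate
        results.append(acc)                       # last stage emits the sum
    return results


print(pipeline_convolve([1, 2, 3], [2, 1]))       # prints [2, 5, 8, 3]
```

The inner loop mirrors the three compiled cases: the first stage starts from nothing, middle stages add to what they receive, and the last stage emits the finished sum.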

Item 804 is a sequence of Forth instructions that initialize a partial value (p) on the return stack to zero. Specifically, a literal value is put onto the data stack (D: h -- p h R: --); it is then popped from there and pushed onto the top of the return stack (D: p h -- h R: -- p); and two nop instructions are used to fill out the instruction word. [Note that this particular approach was chosen to make conceptual comparison with the direct filter 700 easier, and skilled programmers will appreciate that there are more efficient and elegant ways to handle this.]

Item 806 then handles the additional addition that uses the present partial value (p). The present partial value (p) is popped from the return stack and pushed onto the data stack (D: a'' h -- p a'' h R: p --); a first nop instruction takes up time before the next instruction; the top two elements of the data stack are added together, with the top element replaced by the sum of the next partial value (p') and the accumulation sum (a''), and the second element replaced by the next lower element (D: p a'' h -- p' h R: --); and a second nop instruction pads the 18-bit instruction word.

Item 808 then retains the accumulation sum (a'') on the return stack as the next partial value (p'). The accumulation sum (a'') is duplicated (D: a'' h -- a'' a'' h R: --); the next partial value (p') is popped from the data stack and pushed onto the return stack (D: a'' a'' h -- a'' h R: -- p'); and two nop instructions fill out the 18-bit instruction word.

Turning now to FIG. 8b, this shows the additional code used for the "integrator" step when the derivative-based algorithm is used. Note that in this particular example this code runs in an additional processor 14.

Item 810 is a comment in the Forth language and item 812 is a location label in the Forth language. This code can be conditionally compiled by adding appropriate compiler directives to the code in FIG. 8a, or it can be compiled separately.

Item 814 is a sequence of Forth instructions that first sets the B register 46 to the value of IO and then sets the A register 44 to the value $3F (a port address).

Item 816 is another sequence of Forth instructions. Specifically, these put a zero on the data stack. The top element on the stack is duplicated and pushed onto the data stack, and then this is done again (it does not matter what that top element is). Then the two uppermost elements are popped from the data stack and exclusive-ORed, and the result (zero) is pushed back onto the data stack.

Item 818 specifies the start of a loop.

Item 820 is another sequence of Forth instructions. Specifically, a value is read from where the B register 46 points and is pushed onto the data stack; then the top two elements on the data stack are added and the sum replaces the top element (and the second element is replaced with the next lower element); then the top element on the data stack is duplicated and pushed onto the data stack; and then the top element is popped from the data stack and written to where the A register 44 points. The net result is that a sum is output while a copy is also retained (accumulated) for the next execution of the loop.

Item 822 is where the loop returns to item 818.

As can be seen, the derivative-based algorithm used by the inventive convolution system 10 requires very little additional code.

While various embodiments have been described above, it should be understood that they have been presented by way of example only. The breadth and scope of the invention should not be limited by any of the above-described exemplary embodiments, but should instead be defined only in accordance with the following claims and their equivalents.
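The integrator loop of items 814 to 822 amounts to a running sum, and the reason it recovers the ordinary convolution can be checked numerically. The following Python sketch is an illustration under stated assumptions, not the Forth code of FIG. 8b: it takes the "derivative" of the filter to be the first difference of the coefficient sequence (padded with a zero at each end), and all function names are hypothetical.

```python
def direct_convolve(samples, coeffs):
    """Plain convolution, used here as the reference result."""
    out = [0] * (len(samples) + len(coeffs) - 1)
    for i, s in enumerate(samples):
        for k, h in enumerate(coeffs):
            out[i + k] += s * h
    return out


def derivative_coeffs(coeffs):
    """First difference of the coefficients (ASSUMED meaning of 'derivative')."""
    padded = [0] + list(coeffs) + [0]
    return [padded[i + 1] - padded[i] for i in range(len(padded) - 1)]


def integrate(values):
    """Running sum: output each new total and keep it for the next value."""
    total, out = 0, []
    for v in values:
        total += v
        out.append(total)
    return out


def derivative_convolve(samples, coeffs):
    """Convolve with the differenced coefficients, then integrate."""
    differenced = direct_convolve(samples, derivative_coeffs(coeffs))
    return integrate(differenced)[: len(samples) + len(coeffs) - 1]


print(derivative_coeffs([2, 1]))               # prints [2, -1, -1]
print(derivative_convolve([1, 2, 3], [2, 1]))  # prints [2, 5, 8, 3]
print(direct_convolve([1, 2, 3], [2, 1]))      # prints [2, 5, 8, 3]
```

Differencing the coefficients turns each output into the change between successive convolution results, so a single running sum at the end of the chain restores the results themselves.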

[Brief Description of the Drawings]

FIG. 1 is a diagram depicting the inventive convolution system used in an array of computer processors;

FIG. 2 (prior art) is a diagram of the major internal features of one of the processors in FIG. 1, specifically a core of the SEAforth-24A processor by IntellaSys Corporation of Cupertino, California, which is used in many of the examples herein;

FIGS. 3a-c are partial views depicting inward, outward, and internal communications using the processors in FIG. 1, wherein FIG. 3a shows how data passes between an input device and the first processor and between the first processor and the second processor, FIG. 3b shows how data passes between the second-to-last processor and the last processor and between the last processor and an output device, and FIG. 3c shows how data passes between two exemplary processors used in the middle of the array;

FIGS. 4a-f are block diagrams outlining the convolution calculation stages in an array of processors such as that of FIG. 1;

FIGS. 5a-f are block diagrams outlining the convolution calculation stages according to a new algorithm, again in an array of processors such as that of FIG. 1;

FIGS. 6a-c are graphs depicting convolutions performed using the two approaches presented in FIGS. 4a-f and FIGS. 5a-f, wherein FIG. 6a shows a first graph representing use of conventional convolution coefficients and a second graph representing use of derivative convolution coefficients, FIG. 6b shows a first graph representing input signal data and a second graph representing use of derivative signal data, and FIG. 6c shows a single graph representing the results of the approaches discussed;

FIG. 7 is a code listing suitable for use in a direct filter; and

FIGS. 8a-b are a code listing suitable for use in a derivative filter, wherein FIG. 8a shows code that is conceptually similar in function to that of FIG. 7 and FIG. 8b shows the additional code used by the derivative-based algorithm.

In the various figures of the drawings, like reference characters are used to denote like or similar elements or steps.

[Description of Main Reference Numerals]

10: convolution system
12: array
14: processor
16: external input device
18: input bus
20: output bus
22: external output device
24: flow path
26: die
28: bus
30: arithmetic logic unit
32: read-only memory
34: random access memory
36: instruction decode logic section
38: instruction area
40: data stack
42: return stack
44, 46, 48, 50: registers
52: port
60: signal data element
62: integral kernel filter element
64: calculation element
66: result element
72, 74, 76, 82, 84, 86, 88: boxes
92, 92', 94, 94', 96: graphs
700, 822: filters

Claims (1)

1. A system for calculating a convolution of a data function with a filter function, comprising:
an array of processors including first and last said processors, wherein said processors each include:
a logic to multiply a coefficient value that is based on a derivative of the filter function and a data value that is representative of the data function to produce a current intermediate value;
in said processors other than said first processor, a logic to receive a prior intermediate value that is representative of a previously performed calculation in another of said processors, and to add said prior intermediate value to said current intermediate value; and
in said processors other than said last processor, a logic to send said data value and said current intermediate value to another said processor; and
a logic to hold a prior intermediate value from said last processor, if any, as a prior partial value, and to add said prior partial value to said current intermediate value of said last processor to produce a result value;
wherein said array of processors receives a series of said data values and produces a series of said result values that collectively represent the convolution of the data function with the filter function.

2. The system of claim 1, wherein said processors perform processing in parallel.

3. The system of claim 1, wherein said processors perform processing asynchronously.

4. The system of claim 1, wherein said processors communicate with each other asynchronously.

5. The system of claim 1, 2, 3, or 4, wherein a plurality of said processors occupy a single semiconductor die.

6. The system of claim 5, wherein all of said array of processors occupy said semiconductor die.

7. The system of any one of claims 1 to 6, wherein said processors each further include a filter storage element to hold said coefficient value.

8. The system of any one of claims 1 to 7, further comprising a logic to receive said data value into said first processor from outside the system.

9. The system of any one of claims 1 to 8, further comprising a logic to send said result value of said last processor outside the system.

10. The system of any one of claims 1 to 9, wherein said logic to hold said prior intermediate value is located in said last processor.

11. A method for calculating a result value in a convolution of a data function with a filter function, the method comprising:
(a) obtaining a sequence of coefficient values that are based on a derivative of the filter function;
(b) for a data value representing the data function:
(i) for each said coefficient value, in a pipeline of computer processors including a first and a last said processor:
(A) multiplying said coefficient value and said data value to produce a current intermediate value;
(B) except in said first processor, adding to said current intermediate value a prior intermediate value representative of a previously performed calculation in another of said processors; and
(C) except in said last processor, sending said data value and said current intermediate value to a subsequent said processor; and
(ii) adding a prior partial value, if any, to said current intermediate value from said last processor to produce a result value, wherein said prior partial value is a previous intermediate value from said last processor; and
(iii) outputting the result value to a digital signal processor employing this method.

12. A method for calculating a convolution of a data function with a filter function, the method comprising:
(a) obtaining a sequence of coefficient values that are based on a derivative of the filter function;
(b) obtaining a sequence of data values representing the data function;
(c) for each said data value:
(i) for each said coefficient value, in a pipeline of computer processors including a first and a last said processor:
(A) multiplying said coefficient value and said data value to produce a current intermediate value;
(B) except in said first processor, adding to said current intermediate value a prior intermediate value representative of a previously performed calculation in another of said processors; and
(C) except in said last processor, sending said data value and said current intermediate value to a subsequent said processor; and
(ii) adding a prior partial value, if any, to said current intermediate value from said last processor to produce a result value, wherein said prior partial value is a previous intermediate value from said last processor;
(d) accumulating said result values of said (c) as the convolution; and
(e) outputting the convolution to a digital signal processor employing this method.

13. The method of claim 12, wherein said (c)(i)(A) is performed concurrently for a plurality of said data values in a plurality of said processors.

14. The method of claim 12, wherein said (c)(i)(A) is performed concurrently for the sequence of said coefficient values in a plurality of said processors.

15. A system for calculating a convolution, wherein:
at least one processor multiplies a coefficient value representing a filter function by a data value representing a data function;
and wherein:
said coefficient value is based on a derivative of the filter function.

16. The system of claim 15, further comprising a plurality of said processors that multiply a plurality of said coefficient values by a plurality of said data values in parallel.

17. A method for calculating a convolution in a computer processor, wherein:
a coefficient value represents a filter function;
a data value represents a data function; and
said coefficient value and said data value are multiplied to produce result values that collectively represent the convolution;
and wherein:
said coefficient value is based on a derivative of the filter function.

18. The method of claim 17, further comprising performing said multiplying concurrently for a plurality of said data values in a plurality of said processors.

19. A signal processor comprising the system of any one of claims 1 to 10, 15, or 16, and means for providing, from a signal, said data values representative of the signal.

20. A method of processing a signal, comprising deriving from a signal data values representative of the signal and processing them in accordance with the method of any one of claims 11 to 14, 17, or 18.

21. The signal processor of claim 19, which is a digital filter.

22. The method of claim 20, which filters a signal.

23. A computer program which, when run on an array of computer processors, causes the array to perform the method of any one of claims 11 to 14, 20, or 22.

24. A carrier carrying the program of claim 23.

25. The carrier of claim 24, which is a recording medium on which the program is recorded.
TW97110966A 2007-04-06 2008-03-27 Signal processing TW200842699A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US91062607P 2007-04-06 2007-04-06

Publications (1)

Publication Number Publication Date
TW200842699A true TW200842699A (en) 2008-11-01

Family

ID=39827918

Family Applications (1)

Application Number Title Priority Date Filing Date
TW97110966A TW200842699A (en) 2007-04-06 2008-03-27 Signal processing

Country Status (4)

Country Link
JP (1) JP2009010925A (en)
CN (1) CN101652770A (en)
TW (1) TW200842699A (en)
WO (1) WO2008124061A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10546211B2 (en) 2016-07-01 2020-01-28 Google Llc Convolutional neural network on programmable two dimensional image processor
US10122346B2 (en) * 2017-03-03 2018-11-06 Synaptics Incorporated Coefficient generation for digital filters

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6567564B1 (en) * 1996-04-17 2003-05-20 Sarnoff Corporation Pipelined pyramid processor for image processing systems
US7158141B2 (en) * 2002-01-17 2007-01-02 University Of Washington Programmable 3D graphics pipeline for multimedia applications
JP2005217837A (en) * 2004-01-30 2005-08-11 Sony Corp Sampling rate conversion apparatus and method thereof, and audio apparatus

Also Published As

Publication number Publication date
CN101652770A (en) 2010-02-17
WO2008124061A1 (en) 2008-10-16
JP2009010925A (en) 2009-01-15
