WO2018054104A1 - Deep memory and measuring instrument - Google Patents

Deep memory and measuring instrument

Info

Publication number
WO2018054104A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
fpga
sub
data
deep
Prior art date
Application number
PCT/CN2017/088649
Other languages
English (en)
French (fr)
Inventor
周立功
Original Assignee
广州致远电子有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州致远电子有限公司
Publication of WO2018054104A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629Configuration or reconfiguration of storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0688Non-volatile semiconductor memory arrays

Definitions

  • Memory is a key component of electronic measuring instruments. Faced with the era of big data, the requirements of measuring instruments for memory capacity, bandwidth and processing performance are also increasing.
  • the storage capacity of deep memory of general-purpose measuring instruments such as oscilloscopes and logic analyzers is usually very low.
  • the present invention provides a deep memory and a measuring instrument to solve the problems of small storage capacity, small bandwidth, and low processing performance of the general measuring instrument in the prior art.
  • a deep memory comprising: at least one memory, the memory comprising at least two sub-memories, each of the sub-memories comprising a field-programmable gate array (FPGA) and a storage unit connected to the FPGA, the FPGAs of the at least two sub-memories being cascaded.
  • the FPGA is an FPGA with 160,000 or more logic resources.
  • the storage unit comprises a plurality of DDR3 chips in parallel, and the buses of the plurality of DDR3 chips are merged and connected to the FPGA to implement input and output of data.
  • the number of the DDR3 chips is 4 or 8.
  • the number of the memory is one, and the memory includes a first sub-memory and a second sub-memory;
  • the first sub-memory includes a first FPGA and a first storage unit connected to the first FPGA;
  • the second sub-memory includes a second FPGA and a second storage unit connected to the second FPGA;
  • the first FPGA is cascaded with the second FPGA, and data is input by the first FPGA and output by the second FPGA.
  • the number of the memory is two, the memory includes a first sub-memory, a second sub-memory, a third sub-memory, and a fourth sub-memory;
  • the first sub-memory includes a first FPGA and a first storage unit connected to the first FPGA;
  • the second sub-memory includes a second FPGA and a second storage unit connected to the second FPGA;
  • the third sub-memory includes a third FPGA and a third storage unit connected to the third FPGA;
  • the fourth sub-memory includes a fourth FPGA and a fourth storage unit connected to the fourth FPGA;
  • the first FPGA is cascaded with the second FPGA, and data is input by the first FPGA and output by the second FPGA;
  • the third FPGA is cascaded with the fourth FPGA, and data is input by the third FPGA and output by the fourth FPGA.
  • the deep memory further includes a fifth FPGA, and an input end of the fifth FPGA is respectively connected to an output end of the second FPGA and an output end of the fourth FPGA, and data is output by the fifth FPGA.
  • the invention also provides a measuring instrument comprising the deep memory described above.
  • the measuring instrument is an oscilloscope or a logic analyzer.
  • the deep memory provided by the present invention includes at least one memory, the memory includes at least two sub-memories, each of the sub-memories includes an FPGA (Field-Programmable Gate Array) and a storage unit connected to the FPGA, and the FPGAs of the at least two sub-memories are cascaded. That is, the deep memory includes a plurality of storage units, and as the number of storage units increases, so do its total capacity and total bandwidth. Since the FPGAs of the two sub-memories are cascaded, the same unit of data can be processed on the two FPGAs simultaneously, which is equivalent to pipelined processing; the total processing time is thus reduced, the processing speed is increased, and the processing performance is improved.
  • FPGA Field-Programmable Gate Array
  • the measuring instrument provided by the present invention includes the deep memory described above, and the performance of the measuring instrument is better due to an increase in storage capacity, an increase in bandwidth, and an improvement in processing performance of the deep memory.
  • FIG. 1 is a schematic structural diagram of a deep memory according to an embodiment of the present invention.
  • Figure 2 shows the data processing architecture of a conventional FPGA
  • FIG. 3 is a schematic diagram of a data sampling process of a conventional FPGA
  • FIG. 4 is a data processing architecture of a large-scale FPGA according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a data sampling process of a large-scale FPGA according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a storage unit according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic structural diagram of another deep memory according to an embodiment of the present invention.
  • an oscilloscope or other measuring instrument such as a logic analyzer requires a deep memory with large storage capacity, high bandwidth, and high processing performance.
  • An embodiment of the present invention provides a deep memory including at least one memory, the memory including at least two sub-memories, each of the sub-memories including an FPGA and a storage unit connected to the FPGA, the FPGAs of the at least two sub-memories being cascaded. It should be noted that each sub-memory is a separate unit capable of processing the data input to it by itself, and connected sub-memories can share data through the FPGA interconnect.
  • the number of memory cells is increased, which increases the total memory capacity of the deep memory.
  • the bandwidth of a storage unit is p'
  • the number of added sub-memories is not limited.
  • the number of the sub-memories is two or four; FIG. 1 shows the structure with two sub-memories. Those skilled in the art will understand that, in other cases, the number of cascaded sub-memories can be increased further, depending on the storage capacity requirement of the deep memory.
  • the FPGA is preferably an FPGA with 160,000 or more logic resources. More preferably, both FPGA 1 and FPGA 2 are XC7K160T FPGAs, whose logic resources can reach 160,000; the processing speed can be further increased, and the data processing architecture changes relative to FPGAs in the prior art, thereby improving the processing performance.
  • other FPGA models with 160,000 or more logic resources can also be applied in this embodiment, which is not limited here.
  • the usual data flow is: data is read from the "storage unit" through the "storage unit controller", slowed down by the "bit width conversion" module, sent to "data sampling" for processing, and then sent to each "function module".
  • Traditional solutions are often limited by resources or design difficulty, so the sampling module and the bit width conversion module are usually separate. At the lower bus width (L1), few processing resources are needed (usually 32 or 64 bits), which is easy to implement, but processing takes a long time.
  • Tsystem is the system clock cycle.
  • the bit width conversion module first converts the data arriving at the bit width conversion unit with bus width L2 into the bus width L1 that the data sampling module can use; the data sampling module then samples the data serially, scanning the items one by one, to obtain the sampled data D4 and D8.
  • the data processing architecture provided in this embodiment takes advantage of the abundant resources of a large-scale FPGA to perform parallel acceleration, and physically combines the "bit width conversion" and "data sampling" modules into a "fast sampling module"; the data processing architecture is shown in FIG. 4.
  • the sampling process is shown in FIG. 5, in which the bit width conversion module is merged with the data sampling module.
  • using the high bus width L2, the fast sampling module processes the L2-wide input data in parallel and directly samples out the data D4 and D8.
  • the processing performance of the high-performance data processing architecture is L2/L1 times that of the traditional data processing architecture.
  • in a practical application, L2 is 512 and L1 is 64, which gives 8 times the performance of the traditional solution.
  • in traditional solutions L1 is usually 32; compared with the case where L1 is 32, the data processing performance of this embodiment is improved by 16 times.
  • the storage unit includes a plurality of dynamic memories, including a plurality of DDR3s.
  • the number of the DDR3 chips is preferably 4 or 8, depending on actual requirements. In other embodiments of the present invention, when the processing capability of the FPGA reaches a higher level, the number of the DDR3 chips may be increased further.
  • the number of the DDR3 chips may grow with demand and with the processing capability of the FPGA, and is not limited in this embodiment.
  • the DDR3 described in this embodiment is a computer memory specification. It belongs to the SDRAM (Synchronous Dynamic Random Access Memory) family of memory products, provides higher operating performance and lower voltage than DDR2 SDRAM, is the successor to DDR2 SDRAM, and is a currently popular memory specification.
  • the structure of the storage unit is shown in FIG. 6. It consists of n DDR3 chips in parallel. After the buses are merged, they are connected to the FPGA to implement the input and output of data.
  • the advantage of using the parallel structure is that it can increase the storage capacity by n times and increase the storage bandwidth by n times.
  • the total bandwidth of the deep memory provided by this embodiment is:
  • N represents the number of sub-memories of the deep memory
  • p represents the bandwidth of a single dynamic memory
  • n represents the number of dynamic memories in the storage unit (n is typically 4 or 8).
  • D represents the total memory capacity
  • N represents the number of sub-memories of the deep memory
  • d represents the capacity of a single dynamic memory
  • n represents the number of dynamic memories in the storage unit.
  • the bandwidth of dynamic memory is calculated as follows:
  • p represents a single dynamic memory bandwidth
  • w represents the data bus width of a single dynamic memory
  • r represents the data rate of the dynamic memory.
  • bps (bits per second) is the unit of bandwidth and represents the bit rate per unit time.
  • the dynamic memory in this embodiment preferably uses DDR3 chips with a 16-bit data bus, a data frequency of 1600 MHz, and a capacity of 2 Gbits, and each storage unit has n = 4 dynamic memories; the total bandwidth of the memory is calculated from formula (3) and formula (5) as:
  • the total storage capacity can be calculated as:
  • the deep memory provided in this embodiment increases the number of sub-memories and with it the number of storage units, thereby increasing the total storage capacity and total bandwidth of the deep memory. Since the added sub-memories are cascaded with the existing sub-memories, data processing is equivalent to pipelined processing: the multi-stage FPGAs process the data simultaneously, which increases the data processing speed and thus the processing performance of the deep memory.
  • the FPGA is preferably a large-scale FPGA, which further improves the processing performance of the deep memory. Because the FPGA is large-scale, the number of DDR3 chips in the storage unit can be increased, further increasing the total storage capacity and total bandwidth of the deep memory and solving the problems of small capacity, low bandwidth, and poor processing performance in the storage of existing measuring instruments.
  • in the deep memory provided in this embodiment, as shown in FIG. 1, the number of memories is one, and the memory includes a first sub-memory and a second sub-memory; the first sub-memory includes a first FPGA (i.e., FPGA 1 in the figure) and a first storage unit connected to FPGA 1 (i.e., storage unit 1 in the figure); the second sub-memory includes a second FPGA (i.e., FPGA 2 in the figure) and a second storage unit connected to FPGA 2 (i.e., storage unit 2 in the figure); FPGA 1 is cascaded with FPGA 2, and data is input by FPGA 1 and output by FPGA 2.
  • the first sub-memory includes a first FPGA (i.e., FPGA 1 in the figure) and a first storage unit connected to FPGA 1 (i.e., storage unit 1 in the figure);
  • the second sub-memory includes a second FPGA (i.e., FPGA 2 in the figure) and a second storage unit connected to FPGA 2 (i.e., storage unit 2 in the figure);
  • both FPGA 1 and FPGA 2 are XC7K160T FPGAs. Those skilled in the art will understand that other FPGA models with 160,000 or more logic resources can also be applied in this embodiment.
  • the embodiment does not limit this.
  • storage unit 1 and storage unit 2 each include a plurality of DDR3 chips in parallel, and the buses of the DDR3 chips are merged and connected to the FPGA to implement data input and output.
  • the number of DDR3 chips is 4, the data bus width is 16 bits, the data frequency is 1600 MHz, and the capacity is 2 Gbits.
  • the total bandwidth of the deep memory is:
  • the total storage capacity of the deep memory is:
  • in the conventional technology there is only one sub-memory, the FPGA is small-scale with limited resources, and the storage unit can be configured with only one DDR3 chip.
  • the total bandwidth and the total storage capacity of the deep memory provided in this embodiment are both 8 times those of the conventional technology, and the processing performance is L2/L1 times that of a conventional memory.
  • the embodiment provides a deep memory. As shown in FIG. 7, the number of memories is two.
  • the memory includes a first sub-memory, a second sub-memory, a third sub-memory, and a fourth sub-memory;
  • the first sub-memory includes a first FPGA (i.e., FPGA 1 in the figure) and a first storage unit connected to FPGA 1 (i.e., storage unit 1 in the figure);
  • the second sub-memory includes a second FPGA (i.e., FPGA 2 in the figure) and a second storage unit connected to FPGA 2 (i.e., storage unit 2 in the figure);
  • the third sub-memory includes a third FPGA (i.e., FPGA 3 in the figure) and a third storage unit connected to FPGA 3 (i.e., storage unit 3 in the figure);
  • the fourth sub-memory includes a fourth FPGA (i.e., FPGA 4 in the figure) and a fourth storage unit connected to FPGA 4 (i.e., storage unit 4 in the figure);
  • the deep memory in this embodiment is composed of four sub-memories, which are paired to form memories and then connected in parallel to form a two-way parallel data processing and communication structure; each sub-memory is an independent module whose data processing is not affected by the other sub-memories, which reflects the nature of parallel processing.
  • FPGA 1, FPGA 2, FPGA 3, and FPGA 4 are all large-scale FPGAs, preferably XC7K160T FPGAs. Those skilled in the art will understand that other FPGA models with 160,000 or more logic resources can also be applied in this embodiment, which is not limited here.
  • storage unit 1, storage unit 2, storage unit 3, and storage unit 4 each include a plurality of DDR3 chips in parallel, and the buses of the DDR3 chips are merged and connected to the FPGA to implement data input and output.
  • the number of DDR3 chips in each of storage unit 1, storage unit 2, storage unit 3, and storage unit 4 is 4, the data bus width is 16 bits, the data frequency is 1600 MHz, and the capacity is 2 Gbits.
  • the total storage capacity of the deep memory provided in this embodiment is:
  • in the conventional technology there is only one sub-memory, the FPGA is small-scale with limited resources, and the storage unit can be configured with only one DDR3 chip.
  • the total bandwidth and the total storage capacity of the deep memory provided in this embodiment are both 16 times those of the conventional technology, and the processing performance is L2/L1 times that of a conventional memory.
  • this embodiment further includes a fifth FPGA (i.e., FPGA 5 in the figure): data is input to and processed by FPGA 1 and FPGA 3, sent on to FPGA 2 and FPGA 4 respectively for processing, and finally gathered at FPGA 5, which completes the data processing and outputs the result.
  • the FPGA 5 also preferably adopts an FPGA of the XC7K160T model, thereby improving the processing performance of the deep memory.
  • An embodiment of the present invention provides a measuring instrument including a deep memory, the deep memory being that of Embodiment 1, Embodiment 2, or Embodiment 3 and including at least one memory, the memory including at least two sub-memories, each sub-memory including an FPGA and a storage unit connected to the FPGA, the FPGAs of the at least two sub-memories being cascaded.
  • the FPGAs involved are all XC7K160T FPGAs, whose logic resources can reach 160,000, so the processing speed can be further increased.
  • the data processing architecture changes relative to FPGAs in the prior art, which in turn improves the processing performance.
  • each of the storage units preferably includes a plurality of DDR3 chips, and the number of DDR3 chips is not limited.
  • the specific form of the measuring instrument is not limited in this embodiment, as long as it contains the deep memory; the measuring instrument is preferably an oscilloscope or a logic analyzer.
  • an oscilloscope that uses multiple DDR3 chips together with large-scale FPGA devices and the deep memory structure provided by the present invention can effectively solve the problems of small capacity, low bandwidth, and poor performance of traditional instrument storage solutions.
  • the storage depth is up to 512 Mpts and the processing performance is high, so the oscilloscope does not lose waveform detail when observing long waveforms. This solves a major problem in the deep storage technology of traditional general-purpose measuring instruments.
  • a logic analyzer that uses a large-scale FPGA, multiple DDR3 chips, and the deep memory structure provided by the present invention can likewise effectively solve the problems of small capacity, low bandwidth, and poor performance of traditional instrument storage solutions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)
  • Image Processing (AREA)

Abstract

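A deep memory and a measuring instrument. The deep memory comprises at least one memory; the memory comprises at least two sub-memories; each sub-memory comprises an FPGA and a storage unit connected to the FPGA; and the FPGAs of the at least two sub-memories are cascaded. Because the deep memory comprises multiple storage units, its total capacity and total bandwidth increase with the number of storage units. Because the FPGAs of the two sub-memories are cascaded, a given unit of data can be processed on the two FPGAs at the same time, which is equivalent to pipelined processing; the total processing time therefore decreases, the processing speed increases, and the processing performance improves.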

Description

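Deep memory and measuring instrument
Technical Field
This application claims priority to Chinese patent application No. 201610851332.X, entitled "Deep memory and measuring instrument", filed with the Chinese Patent Office on September 26, 2016, the entire contents of which are incorporated herein by reference.
Background
Memory is a key component of electronic measuring instruments. In the era of big data, measuring instruments place ever greater demands on memory capacity, bandwidth, and processing performance.
The deep memory of existing general-purpose measuring instruments such as oscilloscopes and logic analyzers usually has a very small storage capacity, typically less than 10 Mpts (for an 8-bit oscilloscope, one sampling point is 1 pt = 8 bits), so it can store only 10 mega sampling points. The deep memory is also very slow: 10 Mpts of data takes a long time to process.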
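Summary of the Invention
In view of this, the present invention provides a deep memory and a measuring instrument to solve the problems of small storage capacity, low bandwidth, and low processing performance of general-purpose measuring instruments in the prior art.
To this end, the technical solution provided by the present invention is as follows:
A deep memory, comprising: at least one memory, wherein the memory comprises at least two sub-memories, each sub-memory comprises a field-programmable gate array (FPGA) and a storage unit connected to the FPGA, and the FPGAs of the at least two sub-memories are cascaded.
Preferably, the FPGA is an FPGA with 160,000 or more logic resources.
Preferably, the storage unit comprises multiple DDR3 chips in parallel, and the buses of the DDR3 chips are merged and then connected to the FPGA for data input and output.
Preferably, the number of DDR3 chips is 4 or 8.
Preferably, the number of memories is one, and the memory comprises a first sub-memory and a second sub-memory;
the first sub-memory comprises a first FPGA and a first storage unit connected to the first FPGA;
the second sub-memory comprises a second FPGA and a second storage unit connected to the second FPGA;
the first FPGA is cascaded with the second FPGA, and data is input through the first FPGA and output through the second FPGA.
Preferably, the number of memories is two, and the memories comprise a first sub-memory, a second sub-memory, a third sub-memory, and a fourth sub-memory;
the first sub-memory comprises a first FPGA and a first storage unit connected to the first FPGA;
the second sub-memory comprises a second FPGA and a second storage unit connected to the second FPGA;
the third sub-memory comprises a third FPGA and a third storage unit connected to the third FPGA;
the fourth sub-memory comprises a fourth FPGA and a fourth storage unit connected to the fourth FPGA;
the first FPGA is cascaded with the second FPGA, and data is input through the first FPGA and output through the second FPGA;
the third FPGA is cascaded with the fourth FPGA, and data is input through the third FPGA and output through the fourth FPGA.
Preferably, the deep memory further comprises a fifth FPGA whose inputs are connected to the output of the second FPGA and the output of the fourth FPGA respectively, and data is output through the fifth FPGA.
The present invention also provides a measuring instrument comprising the deep memory described above.
Preferably, the measuring instrument is an oscilloscope or a logic analyzer.
As the above technical solution shows, the deep memory provided by the present invention comprises at least one memory; the memory comprises at least two sub-memories; each sub-memory comprises an FPGA (Field-Programmable Gate Array) and a storage unit connected to the FPGA; and the FPGAs of the at least two sub-memories are cascaded. In other words, the deep memory comprises multiple storage units, and its total capacity and total bandwidth increase with the number of storage units. Because the FPGAs of the two sub-memories are cascaded, a given unit of data can be processed on the two FPGAs at the same time, which is equivalent to pipelined processing; the total processing time therefore decreases, the processing speed increases, and the processing performance improves.
The measuring instrument provided by the present invention comprises the deep memory described above. Because the deep memory has greater storage capacity, higher bandwidth, and better processing performance, the measuring instrument performs better.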
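Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below are merely embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of a deep memory according to an embodiment of the present invention;
FIG. 2 shows the data processing architecture of a conventional FPGA;
FIG. 3 is a schematic diagram of the data sampling process of a conventional FPGA;
FIG. 4 shows the data processing architecture of a large-scale FPGA according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the data sampling process of a large-scale FPGA according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a storage unit according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of another deep memory according to an embodiment of the present invention.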
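Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of those embodiments. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without creative effort fall within the scope of protection of the present invention.
In practice, the inventor found that if the storage depth of an oscilloscope is large enough, the highest waveform sampling rate can be maintained, and the observed waveform is more faithful and detailed. Conversely, if the storage depth is only 1 Mpts or less, the oscilloscope is forced to lower its sampling rate when observing a long waveform. With too few sampling points, the displayed waveform is severely distorted and may even alias, misleading the user's measurement and analysis.
Therefore, oscilloscopes and other measuring instruments such as logic analyzers all need a deep memory with large storage capacity, high bandwidth, and high processing performance.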
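The trade-off described above follows from the general relation storage depth = sampling rate × capture time. The following is a minimal Python sketch of that relation, not part of the application; the 1 GSa/s ADC limit, the 10 ms capture window, and the function name `usable_sample_rate` are illustrative assumptions.

```python
def usable_sample_rate(depth_pts: float, capture_time_s: float, adc_max_rate: float) -> float:
    """Highest sampling rate (Sa/s) that still fits capture_time_s into depth_pts."""
    return min(adc_max_rate, depth_pts / capture_time_s)

ADC_MAX = 1e9     # assumed ADC limit of 1 GSa/s (hypothetical)
WINDOW = 10e-3    # assumed capture window of 10 ms (hypothetical)

shallow = usable_sample_rate(1e6, WINDOW, ADC_MAX)     # 1 Mpts storage depth
deep = usable_sample_rate(512e6, WINDOW, ADC_MAX)      # 512 Mpts storage depth
print(f"1 Mpts: {shallow:.3g} Sa/s, 512 Mpts: {deep:.3g} Sa/s")
```

Under these assumed numbers, the 1 Mpts memory forces the sampling rate down to roughly a tenth of the ADC limit, while the 512 Mpts memory keeps the full ADC rate for the same window.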
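Embodiment 1
This embodiment of the present invention provides a deep memory comprising at least one memory; the memory comprises at least two sub-memories; each sub-memory comprises an FPGA and a storage unit connected to the FPGA; and the FPGAs of the at least two sub-memories are cascaded. It should be noted that each sub-memory is an independent unit that can process the data input to it on its own, and that connected sub-memories can share data through the FPGA interconnect.
Suppose the storage capacity of one storage unit is d′ and the number of sub-memories in the deep memory of this embodiment is N. The total storage capacity of the deep memory is then D=N*d′, so increasing the number of storage units increases the total storage capacity of the deep memory. Likewise, suppose the bandwidth of one storage unit is p′. The total bandwidth of the deep memory is then P=N*p′, so the total bandwidth depends on the number of storage units and grows as that number grows. Once a new FPGA is connected in series, the FPGAs are cascaded and the multi-stage FPGAs process data simultaneously, which is equivalent to pipelined processing. Compared with the time a single FPGA needs to process the data, simultaneous multi-stage processing reduces the processing time and so increases the data processing speed, that is, the processing performance of the deep memory.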
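The scaling relations D=N*d′ and P=N*p′, and the pipelining argument, can be sketched in Python as follows. This is an illustrative model only: it assumes each cascaded FPGA is one pipeline stage, that every stage takes the same time per data block, and that transfers between FPGAs cost nothing. None of these assumptions, nor the function names, come from the application.

```python
def total_capacity(n_sub: int, unit_capacity: float) -> float:
    """D = N * d': capacity scales with the number of sub-memories."""
    return n_sub * unit_capacity

def total_bandwidth(n_sub: int, unit_bandwidth: float) -> float:
    """P = N * p': bandwidth scales with the number of sub-memories."""
    return n_sub * unit_bandwidth

def single_fpga_time(blocks: int, stages: int, t_stage: float) -> float:
    """One FPGA runs every stage on every block, one after another."""
    return blocks * stages * t_stage

def cascaded_time(blocks: int, stages: int, t_stage: float) -> float:
    """One FPGA per stage; stages overlap on different blocks (ideal pipeline)."""
    return (blocks + stages - 1) * t_stage

blocks, stages, t_stage = 1000, 2, 1.0   # hypothetical workload, arbitrary time units
print(single_fpga_time(blocks, stages, t_stage))  # 2000.0
print(cascaded_time(blocks, stages, t_stage))     # 1001.0
```

In this idealized model, two cascaded FPGAs approach a 2× reduction in total processing time for long data streams; real designs would fall short of that by the inter-FPGA transfer overhead the model ignores.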
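Therefore, this embodiment does not limit the number of added sub-memories. Preferably, the number of sub-memories is 2 or 4; FIG. 1 shows the structure with 2 sub-memories. Those skilled in the art will understand that more sub-memories may be cascaded in other cases, depending on the storage capacity required of the deep memory.
It should be noted that the FPGA in this embodiment is preferably an FPGA with 160,000 or more logic resources. More preferably, FPGA1 and FPGA2 are both XC7K160T FPGAs, whose logic resources can reach 160,000; the processing speed can thus be further increased, and the data processing architecture changes relative to FPGAs in the prior art, which improves the processing performance. Those skilled in the art will understand that other FPGA models with 160,000 or more logic resources can also be used in this embodiment, which is not limited here.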
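FIG. 2 shows the data processing architecture of a conventional FPGA. The usual data flow is: data is read from the "storage unit" via the "storage unit controller", slowed down by the "bit width conversion" module, processed by "data sampling", and then sent to the individual "function modules". Let the bus width of "data sampling" be L1 (usually small, e.g. 32) and the output bus width of the storage unit controller be L2 (usually much larger than L1, e.g. 512). Conventional designs are usually constrained by resources or design difficulty, so the sampling module and the bit width conversion module are separate. At the lower bus width (L1), few processing resources are needed (usually 32 or 64 bits) and the design is easy to implement, but processing takes a long time.
Assuming the total amount of data to process is M (bits), the processing time of the conventional data processing architecture is:
T1 = (M / L1) × Tsystem (Formula (1))
where Tsystem is the system clock period.
FIG. 3 shows a 4× sampling process. The bit width conversion module first converts the data arriving at the bit width conversion unit with bus width L2 into the bus width L1 that the data sampling module can use; the data sampling module then samples the data serially, scanning the items one by one, to obtain the sampled data D4 and D8. In this example, L2=32 and L1=8; with 4× equal-interval sampling, 8 data items take 8 system clock cycles to process.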
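The data processing architecture provided in this embodiment exploits the abundant resources of a large-scale FPGA to accelerate processing in parallel: the "bit width conversion" and "data sampling" modules are physically combined into a "fast sampling module". The data processing architecture is shown in FIG. 4.
The processing time of the large-scale FPGA data processing architecture provided in this embodiment is then:
T2 = (M / L2) × Tsystem (Formula (2))
Because a large-scale FPGA with sufficient resources is used, the design complexity increases considerably, but the processing speed can be greatly improved. For the same data, the sampling process is shown in FIG. 5: the bit width conversion module is merged with the data sampling module, and the high bus width L2 is used. The fast sampling module processes the L2-wide input data in parallel and directly samples out the data D4 and D8. In this example, L2=32 and L1=8; with 4× equal-interval sampling, 8 data items take 2 system clock cycles to process.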
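The two sampling examples (8 cycles at L1=8 versus 2 cycles at L2=32) can be reproduced with a small Python model. This is a behavioral sketch, not FPGA logic: it assumes each data item is 8 bits wide (inferred from 8 items needing 8 cycles at L1=8) and that one bus word is consumed per clock cycle; the function name `decimate` is hypothetical.

```python
def decimate(items, factor, bus_bits, item_bits=8):
    """Keep every `factor`-th item; return (kept_items, clock_cycles)."""
    per_cycle = bus_bits // item_bits      # items delivered per clock cycle
    kept, cycles = [], 0
    for start in range(0, len(items), per_cycle):
        cycles += 1                        # one bus word per clock cycle
        word = items[start:start + per_cycle]
        # every position in the word is examined within the same cycle
        kept += [item for offset, item in enumerate(word)
                 if (start + offset + 1) % factor == 0]
    return kept, cycles

data = [f"D{i}" for i in range(1, 9)]                    # D1 .. D8
conventional = decimate(data, factor=4, bus_bits=8)      # sampler fed at L1 = 8
fast = decimate(data, factor=4, bus_bits=32)             # sampler fed at L2 = 32
print(conventional)  # (['D4', 'D8'], 8)
print(fast)          # (['D4', 'D8'], 2)
```

Both paths keep the same samples, D4 and D8; only the number of clock cycles differs, by the ratio of the two bus widths.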
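Comparing Formula (1) and Formula (2), the processing performance of the high-performance data processing architecture is L2/L1 times that of the conventional data processing architecture. In a practical application with L2 = 512 and L1 = 64, performance is 8 times that of the conventional design. Because conventional designs have fewer FPGA resources, L1 is usually 32; relative to L1 = 32, the data processing performance of this embodiment is 16 times higher. Virtually all systems perform data sampling operations, including peak sampling and equal-interval sampling, and the sampling ratio is usually an arbitrary integer. The values of L2 and L1 in this embodiment are therefore not limited to those above, and are not elaborated further here.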
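The L2/L1 ratio can be checked numerically. The sketch below assumes that both formulas have the form (M / L) × Tsystem, which is what the stated L2/L1 ratio implies. The data volume M is taken as 10 Mpts of 8-bit samples from the background section, while the 200 MHz system clock and the function name `cycles_needed` are hypothetical.

```python
def cycles_needed(total_bits: int, bus_bits: int) -> int:
    """Clock cycles to stream total_bits through a bus_bits-wide sampler (M / L, rounded up)."""
    return -(-total_bits // bus_bits)

M = 10_000_000 * 8     # 10 Mpts of 8-bit samples, in bits
T_SYSTEM = 5e-9        # assumed system clock period (200 MHz, hypothetical)
L2 = 512

for l1 in (32, 64):
    t_conventional = cycles_needed(M, l1) * T_SYSTEM
    t_fast = cycles_needed(M, L2) * T_SYSTEM
    print(f"L1={l1}: {t_conventional * 1e3:.2f} ms vs {t_fast * 1e3:.2f} ms, "
          f"speedup {t_conventional / t_fast:.0f}x")
```

The speedup depends only on the bus widths, not on the assumed clock period, so any other Tsystem gives the same ratios.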
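Because a large-scale FPGA is used, processing is faster and data processing performance is higher, so the number of chips in the storage unit can be increased accordingly. In this embodiment the storage unit preferably comprises multiple dynamic memories, in particular multiple DDR3 chips. Depending on actual requirements, the number of DDR3 chips is preferably 4 or 8. In other embodiments of the present invention, where the FPGA has greater processing capability, the number of DDR3 chips may be increased further according to demand and the processing capability of the FPGA; this embodiment does not limit the number of DDR3 chips. DDR3 as used in this embodiment is a computer memory specification. It belongs to the SDRAM (Synchronous Dynamic Random Access Memory) family of memory products, offers higher performance at a lower voltage than DDR2 SDRAM, is the successor to DDR2 SDRAM, and is a currently popular memory specification.
The structure of the storage unit is shown in FIG. 6. It consists of n DDR3 chips in parallel whose buses are merged and connected to the FPGA for data input and output. The advantage of the parallel structure is that it provides n times the storage capacity and n times the storage bandwidth.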
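Based on the above structure, the total bandwidth of the deep memory provided in this embodiment is:
P = N × n × p (Formula (3))
where P is the total bandwidth of the deep memory, N is the number of sub-memories of the deep memory, p is the bandwidth of a single dynamic memory, and n is the number of dynamic memories in a storage unit (n is usually 4 or 8).
The total storage capacity of the high-performance memory is:
D = N × n × d (Formula (4))
where D is the total capacity of the memory, N is the number of sub-memories of the deep memory, d is the capacity of a single dynamic memory, and n is the number of dynamic memories in a storage unit.
The bandwidth of a dynamic memory is calculated as:
p = w × r (bps) (Formula (5))
where p is the bandwidth of a single dynamic memory, w is the data bus width of a single dynamic memory, and r is the data rate of the dynamic memory. bps (bits per second) is the unit of bandwidth and denotes the bit rate per unit time.
The dynamic memory in this embodiment is preferably a DDR3 chip with a 16-bit data bus, a data frequency of 1600 MHz, and a capacity of 2 Gbits, and each storage unit has n = 4 dynamic memories. From Formula (3) and Formula (5), the total bandwidth of the memory is:
P = N × n × w × r = N × 4 × 16 bits × 1600 MHz = N × 102.4 Gbps (Formula (6))
From Formula (4), the total storage capacity is:
D = N × n × d = N × 4 × 2 Gbits = N × 8 Gbits (Formula (7))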
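Formulas (3) to (7) are simple products and can be evaluated directly. The Python sketch below uses the DDR3 parameters stated in this embodiment (16-bit bus, 1600 MHz, 2 Gbits, n = 4) as defaults; the function names are illustrative and not taken from the application.

```python
def chip_bandwidth_mbps(bus_bits: int = 16, data_rate_mhz: int = 1600) -> int:
    """Formula (5): p = w * r, in Mbit/s."""
    return bus_bits * data_rate_mhz

def total_bandwidth_gbps(n_sub: int, chips_per_unit: int = 4,
                         bus_bits: int = 16, data_rate_mhz: int = 1600) -> float:
    """Formulas (3) and (6): P = N * n * w * r, in Gbit/s."""
    return n_sub * chips_per_unit * chip_bandwidth_mbps(bus_bits, data_rate_mhz) / 1000

def total_capacity_gbit(n_sub: int, chips_per_unit: int = 4, chip_gbit: int = 2) -> int:
    """Formulas (4) and (7): D = N * n * d, in Gbit."""
    return n_sub * chips_per_unit * chip_gbit

for n_sub in (1, 2, 4):
    print(f"N={n_sub}: {total_bandwidth_gbps(n_sub)} Gbps, {total_capacity_gbit(n_sub)} Gbit")
```

For N = 1 this reproduces the per-sub-memory figures of Formulas (6) and (7), 102.4 Gbps and 8 Gbits; larger N scales both linearly.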
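It can be seen that, in the deep memory provided in this embodiment, increasing the number of sub-memories increases the number of storage units and thus the total storage capacity and total bandwidth of the deep memory. Because the added sub-memories are cascaded with the existing ones, data processing is equivalent to pipelined processing: the multi-stage FPGAs process the data simultaneously, which increases the data processing speed and thus the processing performance of the deep memory. In terms of device selection, the FPGA is preferably a large-scale FPGA, which further improves the processing performance of the deep memory. Because the FPGA is large-scale, the number of DDR3 chips in a storage unit can be increased as well, further increasing the total storage capacity and total bandwidth of the deep memory. This solves the problems of small capacity, low bandwidth, and poor processing performance in the storage of existing measuring instruments.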
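Embodiment 2
In the deep memory provided in this embodiment, as shown in FIG. 1, the number of memories is one, and the memory comprises a first sub-memory and a second sub-memory. The first sub-memory comprises a first FPGA (FPGA1 in the figure) and a first storage unit connected to FPGA1 (storage unit 1 in the figure); the second sub-memory comprises a second FPGA (FPGA2 in the figure) and a second storage unit connected to FPGA2 (storage unit 2 in the figure). FPGA1 is cascaded with FPGA2; data is input through FPGA1 and output through FPGA2.
In this embodiment, FPGA1 and FPGA2 are both XC7K160T FPGAs. Those skilled in the art will understand that other FPGA models with 160,000 or more logic resources can also be used in this embodiment, which is not limited here. Storage unit 1 and storage unit 2 each comprise multiple DDR3 chips in parallel, whose buses are merged and connected to the FPGA for data input and output. In this embodiment the number of DDR3 chips is 4, each with a 16-bit data bus, a data frequency of 1600 MHz, and a capacity of 2 Gbits.
According to Formula (6) in Embodiment 1, the total bandwidth of the deep memory is:
P = N × n × w × r = 2 × 4 × 16 bits × 1600 MHz = 2 × 102.4 Gbps = 204.8 Gbps
According to Formula (7) in Embodiment 1, the total storage capacity of the deep memory is:
D = N × n × d = 2 × 4 × 2 Gbits = 2 × 8 Gbits = 16 Gbits
Compared with the conventional technology, which has only one sub-memory, a small-scale FPGA with limited resources, and a storage unit that can hold only one DDR3 chip, the total bandwidth and total storage capacity of the deep memory provided in this embodiment are both 8 times those of the conventional technology, and the processing performance is L2/L1 times that of a conventional memory.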
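Embodiment 3
This embodiment provides a deep memory in which, as shown in FIG. 7, the number of memories is two, and the memories comprise a first sub-memory, a second sub-memory, a third sub-memory, and a fourth sub-memory. The first sub-memory comprises a first FPGA (FPGA1 in the figure) and a first storage unit connected to FPGA1 (storage unit 1 in the figure); the second sub-memory comprises a second FPGA (FPGA2 in the figure) and a second storage unit connected to FPGA2 (storage unit 2 in the figure); the third sub-memory comprises a third FPGA (FPGA3 in the figure) and a third storage unit connected to FPGA3 (storage unit 3 in the figure); the fourth sub-memory comprises a fourth FPGA (FPGA4 in the figure) and a fourth storage unit connected to FPGA4 (storage unit 4 in the figure). FPGA1 is cascaded with FPGA2, with data input through FPGA1 and output through FPGA2; FPGA3 is cascaded with FPGA4, with data input through FPGA3 and output through FPGA4.
The deep memory in this embodiment consists of 4 sub-memories. They are paired so that each pair forms one memory, and the two memories are then connected in parallel to form a two-way parallel data processing and communication structure. Each sub-memory is an independent module whose data processing is unaffected by the other sub-memories, which reflects the parallel nature of the design.
In this embodiment, FPGA1, FPGA2, FPGA3, and FPGA4 are preferably all large-scale FPGAs, preferably XC7K160T FPGAs. Those skilled in the art will understand that other FPGA models with 160,000 or more logic resources can also be used in this embodiment, which is not limited here. Storage unit 1, storage unit 2, storage unit 3, and storage unit 4 each comprise multiple DDR3 chips in parallel, whose buses are merged and connected to the FPGA for data input and output. In this embodiment each of the four storage units has 4 DDR3 chips, each with a 16-bit data bus, a data frequency of 1600 MHz, and a capacity of 2 Gbits.
According to Formula (6), the total bandwidth of the deep memory provided in this embodiment is:
P = N × n × w × r = 4 × 4 × 16 bits × 1600 MHz = 4 × 102.4 Gbps = 409.6 Gbps
According to Formula (7), the total storage capacity of the deep memory provided in this embodiment is:
D = N × n × d = 4 × 4 × 2 Gbits = 4 × 8 Gbits = 32 Gbits
Compared with the conventional technology, which has only one sub-memory, a small-scale FPGA with limited resources, and a storage unit that can hold only one DDR3 chip, the total bandwidth and total storage capacity of the deep memory provided in this embodiment are both 16 times those of the conventional technology, and the processing performance is L2/L1 times that of a conventional memory.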
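The 8× and 16× figures for Embodiments 2 and 3 can be checked against the single-chip baseline with a small configuration model. The sketch below is illustrative: the class `DeepMemoryConfig` is hypothetical, and the baseline is assumed to use the same DDR3 chip (16 bits, 1600 MHz, 2 Gbits) as the embodiments.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DeepMemoryConfig:
    sub_memories: int            # N
    chips_per_unit: int          # n
    bus_bits: int = 16           # w
    data_rate_mhz: int = 1600    # r
    chip_gbit: int = 2           # d

    @property
    def bandwidth_mbps(self) -> int:
        """P = N * n * w * r, in Mbit/s."""
        return self.sub_memories * self.chips_per_unit * self.bus_bits * self.data_rate_mhz

    @property
    def capacity_gbit(self) -> int:
        """D = N * n * d, in Gbit."""
        return self.sub_memories * self.chips_per_unit * self.chip_gbit

baseline = DeepMemoryConfig(sub_memories=1, chips_per_unit=1)   # conventional design
embodiment_2 = DeepMemoryConfig(sub_memories=2, chips_per_unit=4)
embodiment_3 = DeepMemoryConfig(sub_memories=4, chips_per_unit=4)

for name, cfg in (("Embodiment 2", embodiment_2), ("Embodiment 3", embodiment_3)):
    bw_ratio = cfg.bandwidth_mbps // baseline.bandwidth_mbps
    cap_ratio = cfg.capacity_gbit // baseline.capacity_gbit
    print(f"{name}: bandwidth x{bw_ratio}, capacity x{cap_ratio}")
```

Because both bandwidth and capacity are proportional to N × n, the ratios to the baseline are simply 2 × 4 = 8 and 4 × 4 = 16.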
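In addition, it should be noted that, to merge the outputs of the two parallel paths for final output, this embodiment further includes a fifth FPGA (FPGA5 in the figure). Data is input to and processed by FPGA1 and FPGA3, sent to FPGA2 and FPGA4 respectively for processing, and finally gathered at FPGA5, which completes the data processing and outputs the result. In this embodiment FPGA5 is also preferably an XC7K160T FPGA, which improves the processing performance of the deep memory.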
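Embodiment 4
This embodiment of the present invention provides a measuring instrument comprising a deep memory. The deep memory is the deep memory provided in Embodiment 1, Embodiment 2, or Embodiment 3 and comprises at least one memory; the memory comprises at least two sub-memories; each sub-memory comprises an FPGA and a storage unit connected to the FPGA; and the FPGAs of the at least two sub-memories are cascaded. The FPGAs involved are all XC7K160T FPGAs, whose logic resources can reach 160,000; the processing speed can thus be further increased, and the data processing architecture changes relative to FPGAs in the prior art, which improves the processing performance. Those skilled in the art will understand that other FPGA models with 160,000 or more logic resources can also be used in this embodiment, which is not limited here. Correspondingly, each storage unit preferably comprises multiple DDR3 chips, and the number of DDR3 chips is not limited.
It should be noted that this embodiment does not limit the specific form of the measuring instrument, as long as it contains the deep memory; the measuring instrument is preferably an oscilloscope or a logic analyzer.
An oscilloscope that uses multiple DDR3 chips for storage, together with large-scale FPGA devices and the deep memory structure provided by the present invention, can effectively solve the problems of small capacity, low bandwidth, and poor performance of traditional instrument storage solutions. With full hardware acceleration and multi-threaded parallel processing, the storage depth can reach up to 512 Mpts, an internationally leading level, with high processing performance, so the oscilloscope does not lose waveform detail even when observing long waveforms. This solves a major problem in the deep storage technology of traditional general-purpose measuring instruments.
Likewise, a logic analyzer that uses a large-scale FPGA, multiple DDR3 chips, and the deep memory structure provided by the present invention can effectively solve the problems of small capacity, low bandwidth, and poor performance of traditional instrument storage solutions.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be understood by reference to one another.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. The present invention is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.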

Claims (9)

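  1. A deep memory, characterized by comprising:
    at least one memory, wherein the memory comprises at least two sub-memories, each sub-memory comprises a field-programmable gate array (FPGA) and a storage unit connected to the FPGA, and the FPGAs of the at least two sub-memories are cascaded.
  2. The deep memory according to claim 1, characterized in that the FPGA is an FPGA with 160,000 or more logic resources.
  3. The deep memory according to claim 2, characterized in that the storage unit comprises multiple DDR3 chips in parallel, and the buses of the DDR3 chips are merged and then connected to the FPGA for data input and output.
  4. The deep memory according to claim 3, characterized in that the number of DDR3 chips is 4 or 8.
  5. The deep memory according to claim 1, characterized in that
    the number of memories is one, and the memory comprises a first sub-memory and a second sub-memory;
    the first sub-memory comprises a first FPGA and a first storage unit connected to the first FPGA;
    the second sub-memory comprises a second FPGA and a second storage unit connected to the second FPGA;
    the first FPGA is cascaded with the second FPGA, and data is input through the first FPGA and output through the second FPGA.
  6. The deep memory according to claim 1, characterized in that
    the number of memories is two, and the memories comprise a first sub-memory, a second sub-memory, a third sub-memory, and a fourth sub-memory;
    the first sub-memory comprises a first FPGA and a first storage unit connected to the first FPGA;
    the second sub-memory comprises a second FPGA and a second storage unit connected to the second FPGA;
    the third sub-memory comprises a third FPGA and a third storage unit connected to the third FPGA;
    the fourth sub-memory comprises a fourth FPGA and a fourth storage unit connected to the fourth FPGA;
    the first FPGA is cascaded with the second FPGA, and data is input through the first FPGA and output through the second FPGA;
    the third FPGA is cascaded with the fourth FPGA, and data is input through the third FPGA and output through the fourth FPGA.
  7. The deep memory according to claim 6, characterized by further comprising a fifth FPGA whose inputs are connected to the output of the second FPGA and the output of the fourth FPGA respectively, wherein data is output through the fifth FPGA.
  8. A measuring instrument, characterized by comprising the deep memory according to any one of claims 1 to 7.
  9. The measuring instrument according to claim 8, characterized in that the measuring instrument is an oscilloscope or a logic analyzer.
PCT/CN2017/088649 2016-09-26 2017-06-16 Deep memory and measuring instrument WO2018054104A1 (zh)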

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610851332.X 2016-09-26
CN201610851332.XA CN106502580B (zh) 2016-09-26 2016-09-26 Deep memory and measuring instrument

Publications (1)

Publication Number Publication Date
WO2018054104A1 (zh) 2018-03-29

Family

ID=58290678

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/088649 WO2018054104A1 (zh) 2016-09-26 2017-06-16 Deep memory and measuring instrument

Country Status (2)

Country Link
CN (1) CN106502580B (zh)
WO (1) WO2018054104A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
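CN106502580B (zh) * 2016-09-26 2019-02-26 广州致远电子股份有限公司 Deep memory and measuring instrument
CN108320022A (zh) * 2018-01-23 2018-07-24 深圳市易成自动驾驶技术有限公司 Deep learning system construction method and apparatus, deep learning system, and storage medium
CN109062514B (zh) * 2018-08-16 2021-08-31 郑州云海信息技术有限公司 Namespace-based bandwidth control method, apparatus, and storage medium
CN109274941B (zh) * 2018-10-23 2020-12-18 合肥博焱智能科技有限公司 FPGA-based multi-channel video decoding face detection and recognition method
CN112068467B (zh) * 2020-08-24 2022-01-14 国微集团(深圳)有限公司 Data transmission system and data storage system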

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103616934A (zh) * 2013-12-06 2014-03-05 江南大学 FPGA core circuit board structure
CN204347812U (zh) * 2014-12-30 2015-05-20 上海师范大学 FPGA-based server storage circuit
CN104931755A (zh) * 2015-05-08 2015-09-23 中国电子科技集团公司第四十一研究所 High-resolution digital storage oscilloscope
US9153311B1 (en) * 2014-05-27 2015-10-06 SRC Computers, LLC System and method for retaining DRAM data when reprogramming reconfigurable devices with DRAM memory controllers
CN105376061A (zh) * 2015-10-10 2016-03-02 广州慧睿思通信息科技有限公司 FPGA-based decryption hardware platform
CN105677594A (zh) * 2016-01-20 2016-06-15 中国人民解放军国防科学技术大学 Reset and read/write calibration method and device for FPGA devices in a DDR3 interface
CN106502580A (zh) * 2016-09-26 2017-03-15 广州致远电子股份有限公司 Deep memory and measuring instrument

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103616934A (zh) * 2013-12-06 2014-03-05 江南大学 FPGA core circuit board structure
US9153311B1 (en) * 2014-05-27 2015-10-06 SRC Computers, LLC System and method for retaining DRAM data when reprogramming reconfigurable devices with DRAM memory controllers
CN204347812U (zh) * 2014-12-30 2015-05-20 上海师范大学 FPGA-based server storage circuit
CN104931755A (zh) * 2015-05-08 2015-09-23 中国电子科技集团公司第四十一研究所 High-resolution digital storage oscilloscope
CN105376061A (zh) * 2015-10-10 2016-03-02 广州慧睿思通信息科技有限公司 FPGA-based decryption hardware platform
CN105677594A (zh) * 2016-01-20 2016-06-15 中国人民解放军国防科学技术大学 Reset and read/write calibration method and device for FPGA devices in a DDR3 interface
CN106502580A (zh) * 2016-09-26 2017-03-15 广州致远电子股份有限公司 Deep memory and measuring instrument

Also Published As

Publication number Publication date
CN106502580B (zh) 2019-02-26
CN106502580A (zh) 2017-03-15

Similar Documents

Publication Publication Date Title
WO2018054104A1 (zh) Deep memory and measuring instrument
CN105044420B (zh) Waveform search method for a digital oscilloscope
US7574319B2 (en) Instrument architecture with circular processing queue
Khedkar et al. High speed FPGA-based data acquisition system
WO2015127796A1 (zh) Data processing device and method for processing serial tasks
CN107133011A (zh) Multi-channel data storage method for an oscilloscope recorder
CN111258535B (zh) Sorting method for FPGA implementation
WO2022199027A1 (zh) Random write method, electronic device, and storage medium
Tanasic et al. Comparison based sorting for systems with multiple GPUs
CN104537003A (zh) General high-performance data writing method for an HBase database
US10915467B2 (en) Scalable, parameterizable, and script-generatable buffer manager architecture
CN100458973C (zh) Fast access method for long-latency multi-port SRAM in a high-speed pipeline
Bayne et al. OpenForensics: A digital forensics GPU pattern matching approach for the 21st century
CN106443115B (zh) Oscilloscope based on deep storage
CN106055496B (zh) Signal generation circuit and control method for an EEPROM controller
CN205721754U (zh) Matrix data transposition device
CN115220528A (zh) Clock acquisition method, apparatus, chip, electronic device, and storage medium
CN101961248A (zh) Method and device for nonlinear compression in an ultrasound system
CN102231140B (zh) Data envelope acquisition method based on dual-port RAM
Sundar et al. Algorithms for high-throughput disk-to-disk sorting
WO2012167396A1 (en) An innovative structure for the register group
Zhang et al. Design of audio signal processing and display system based on SoC
Ogunlere et al. Performance Analysis of Different Memory Element Configurations
Jin et al. Design of spaceborne SAR imaging processing and fast Verification Based on FPGA
CN110134641A (zh) Big data processing device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17852177

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 190719)

122 Ep: pct application non-entry in european phase

Ref document number: 17852177

Country of ref document: EP

Kind code of ref document: A1