WO2022016946A1 - Shared caching method, baseband processing unit, and chip thereof - Google Patents

Shared caching method, baseband processing unit, and chip thereof Download PDF

Info

Publication number
WO2022016946A1
WO2022016946A1 (PCT/CN2021/090998)
Authority
WO
WIPO (PCT)
Prior art keywords
cache
capture
shared
subsystem
control
Prior art date
Application number
PCT/CN2021/090998
Other languages
French (fr)
Chinese (zh)
Inventor
朱佳
沈家瑞
丁杰
蒋云翔
文承淦
刘勇
黄维
陈宇
Original Assignee
长沙海格北斗信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 长沙海格北斗信息技术有限公司
Publication of WO2022016946A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/78Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • G06F15/781On-chip cache; Off-chip memory
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the invention belongs to the field of chip design, and in particular relates to a shared cache method, a baseband processing unit and a chip thereof.
  • the baseband processing unit mainly includes two parts: the capture subsystem and the tracking subsystem.
  • the high-precision navigation chip needs to support the feature of simultaneous tracking of multi-frequency points, and multiple tracking subsystems are introduced.
  • to support the multi-channel feature in each tracking subsystem, a large tracking sampling point buffer needs to be introduced.
  • in the design of the capture module, to improve the capture sensitivity, it is necessary to introduce a large capture sampling point buffer.
  • designing the sampling point buffers for capture and tracking separately brings large area and power consumption overhead.
  • the baseband processing unit mainly includes two parts: acquisition and tracking:
  • each tracking subsystem is designed with 4 physical related channels, and supports up to 16 logical channels for simultaneous tracking through multiplexing.
  • the sampling points are written to the trace sampling point buffer.
  • the sampling rate is up to 80MHz.
  • one of multiple sampling point streams is selected according to the configuration and, after capture preprocessing, is written into the capture sampling point buffer.
  • the capture algorithm processing unit repeatedly reads the data buffered by the capture sampling point for coherent cumulative integration and matching selection. The time of coherent accumulation and integration will affect the capture sensitivity, and a longer integration time will result in higher capture sensitivity.
  • a typical capture sample buffer is configured with a capacity of 512KB.
  • the total capacity requirement of all sampling point buffers is 1MB, which leads to a sharp increase in the area and power consumption of chip design.
  • in practical applications, high-sampling-rate tracking, concurrent operation of all tracking subsystems and full multiplexing of all tracking channels rarely all occur at the same time; therefore, in the traditional independent-cache design, the utilization efficiency of the sampling point cache is low.
  • One of the objectives of the present invention is to provide a shared cache method that can effectively reduce cache capacity and improve cache utilization, with high reliability and good practicability.
  • Another object of the present invention is to provide a baseband processing unit including the shared cache method.
  • the third object of the present invention is to provide a chip including the shared cache method and a baseband processing unit.
  • the shared cache method provided by the present invention includes the following steps:
  • in step S3, according to the shared cache area designed in step S2, tracking access control, capture access control and cache clock control are performed.
  • the shared cache area obtained in step S1 is designed as follows: there are 8 tracking subsystems and 1 capture subsystem; each tracking subsystem has 1 write request and 4 read requests, and the 5 requests of each tracking subsystem access the same cache area at the same time; the capture subsystem has 1 write request and 1 read request, and the 2 requests of the capture subsystem access the same batch of cache areas in a time-shared manner; the shared cache area is designed with a total capacity of 640KB and is divided into 40 cache units of 16KB each.
  • the tracking access control described in step S3 is specifically performed by the following steps:
  • the tracking access control is divided into control flow control, write flow control and read flow control;
  • for control flow control: control the cache space address, and divide the system time window into several control segments;
  • the capture access control described in step S3 is specifically performed by the following steps:
  • the cache clock control described in step S3 is specifically performed by the following steps:
  • when a cache unit is allocated to a subsystem, the clock of that cache unit is automatically turned on; when the cache unit is released, the clock of that cache unit is automatically turned off.
  • the present invention also provides a baseband processing unit, which includes the above-mentioned shared cache method.
  • the present invention also provides a chip, which includes the above-mentioned shared cache method and baseband processing unit.
  • the shared cache method, baseband processing unit and chip provided by the present invention effectively improve the utilization rate of the sampling point cache and effectively reduce the cache capacity by sharing cache units and controlling the shared cache units; at the same time, the invention can effectively reduce the cache area of the chip, which is beneficial to the miniaturized design of the chip; the invention also improves the utilization rate and uniformity of the cache design, reduces cache power consumption, and has high reliability and good practicability.
  • FIG. 1 is a schematic diagram of functional modules of a baseband processing unit in an existing high-precision navigation chip.
  • FIG. 2 is a schematic flow chart of the method of the present invention.
  • FIG. 3 is a functional block diagram of the hardware implementation of the method of the present invention.
  • FIG. 4 is a schematic diagram of functional modules of the shared cache unit of the method of the present invention.
  • FIG. 5 is a schematic flowchart of a method for tracking access control according to the method of the present invention.
  • FIG. 6 is a schematic diagram of the configuration of a cache array according to an embodiment of the method of the present invention.
  • FIG. 2 is a schematic flow chart of the method of the present invention: the shared cache method provided by the present invention includes the following steps:
  • the technical solution shown in FIG. 4 can be adopted: there are 8 tracking subsystems and 1 capture subsystem; each tracking subsystem has 1 write request and 4 read requests, and the 5 requests (1 write request and 4 read requests) of each tracking subsystem access the same cache area at the same time; the capture subsystem has 1 write request and 1 read request, and its 2 requests (1 write request and 1 read request) access the same batch of cache areas in a time-shared manner; the shared cache area is designed to be 640KB in total and divided into 40 cache units of 16KB each;
  • in step S3, according to the shared cache area designed in step S2, tracking access control (as shown in FIG. 5), capture access control and cache clock control are performed;
  • tracking access control: up to 8 tracking subsystems work at the same time, and each subsystem needs to be allocated an independent cache space; each subsystem has different sampling point rate requirements, so the cache space sizes may differ, and the cache spaces must not overlap one another; each subsystem has 1 write request and 4 read requests that access the same cache unit at the same time, so time-shared control is required;
  • the tracking access control is divided into control flow control, write flow control and read flow control;
  • for control flow control: control the cache space address, and divide the system time window into several control segments;
  • base addr indicates the allocation base address
  • buf size indicates the allocation cache capacity
  • slice_cnt indicates the time window count
  • sample_vld represents the valid flag of the sampling point
  • sample_cnt represents the count of the valid flag of the sampling point
  • sample data joint represents the data splicing value of the sampling point
  • write buffer represents the write cache unit
  • read_req[n] indicates that the nth channel initiates a read request
  • read_flag[n] indicates that the nth channel is currently reading data
  • slice_cnt indicates the time window count
  • read buffer indicates the read buffer unit
  • send samples indicates sending sampling point data
  • Capture access control is controlled using the following steps:
  • the operation bit width used by the capture algorithm processing to access the capture cache is 256 bits, so the capture operation allocates the shared cache in units of 4 cache units, and the user software needs to allocate space according to this granularity.
  • the buffer clock control is controlled by the following steps:
  • when a cache unit is allocated to a subsystem, the clock of that cache unit is automatically turned on; when the cache unit is released, the clock of that cache unit is automatically turned off, thereby reducing power consumption.
  • the user configures tracking subsystem 1 with 4 cache units, tracking subsystem 2 with 6 cache units, and the capture subsystem with 16 cache units.
  • the configuration of the buffer array is shown in Figure 6.
  • in the method of the present invention, the sampling point caches of the tracking system and the capture system are designed and partitioned in a unified way, the cache space is dynamically allocated to each system by user software, and the clock switch of each cache unit is automatically managed by logic, which reduces the overall cache area, improves cache utilization and reduces chip power consumption, and therefore has high promotion value; its value is mainly reflected in the following aspects: (1) the cache area of the chip is effectively reduced and the total cache capacity is reduced to 62.5% while still satisfying the cache demand of the vast majority of scenarios, which reduces the overall chip area, is conducive to the miniaturized design of the chip and further lays a foundation for product portability; (2) the utilization rate and uniformity of the cache design are improved: different cache sizes and bandwidths are allocated to different subsystems, which effectively improves utilization, and the uniform size of each cache unit also improves design simplicity and reduces the difficulty of back-end design; (3) cache power consumption is reduced: logic automatically monitors whether each cache unit is allocated and in use and automatically turns the clock of each cache unit on or off, realizing fine-grained power management and effectively reducing chip power consumption.

Abstract

Disclosed is a shared caching method, comprising: setting up a shared cache area shared by a capture subsystem and a plurality of tracking subsystems; designing the shared cache area according to the number of access requests; and performing tracking access control, capture access control, and cache clock control. Further disclosed are a baseband processing unit comprising the shared caching method above, and a chip comprising the shared caching method and the baseband processing unit. By sharing cache units and controlling the shared cache units, the present invention effectively improves the utilization rate of the sampling point cache and effectively reduces the cache capacity. At the same time, the present invention can effectively reduce the cache area of the chip, which facilitates the miniaturized design of the chip. In addition, the present invention improves the utilization rate and uniformity of the cache design, reduces the power consumption of the cache, and has high reliability and good practicability.

Description

Shared caching method, baseband processing unit, and chip thereof
Technical Field
The invention belongs to the field of chip design, and in particular relates to a shared caching method, a baseband processing unit and a chip thereof.
Background Art
With the development of the economy and technology and the improvement of people's living standards, navigation has become an indispensable auxiliary function in people's production and daily life, bringing great convenience.
In a high-precision navigation chip, the baseband processing unit mainly includes two parts: the capture subsystem and the tracking subsystem. To support multi-system, multi-frequency-point application scenarios, and especially the high-end requirements of positioning and orientation, a high-precision navigation chip needs to support simultaneous tracking of multiple frequency points, so multiple tracking subsystems are introduced. To support the multi-channel feature in each tracking subsystem, a large tracking sampling point buffer needs to be introduced. In the design of the capture module, a large capture sampling point buffer needs to be introduced to improve the capture sensitivity. In the traditional baseband processing method, designing the capture and tracking sampling point buffers separately brings large area and power consumption overhead. A block diagram of a typical solution is shown in FIG. 1.
The baseband processing unit mainly includes two parts, capture and tracking:
In a typical design of the tracking module, 8 tracking subsystems are introduced, supporting simultaneous tracking of 8 frequency points. Each tracking subsystem contains 4 physical correlation channels and, through multiplexing, supports up to 16 logical channels tracking simultaneously. After preprocessing, the sampling points are written into the tracking sampling point buffer. To obtain good tracking sensitivity, the sampling rate is up to 80 MHz. To support the high sampling rate and the correlation multiplexing of the channels, the tracking sampling point buffer is designed with a capacity of 64 KB; the sampling point buffer capacity of all tracking subsystems is therefore 64 KB x 8 = 512 KB.
In a typical design of the capture module, one of multiple sampling point streams is selected according to the configuration and, after capture preprocessing, written into the capture sampling point buffer. The capture algorithm processing unit repeatedly reads the buffered capture sampling point data for coherent accumulation/integration and matching selection. The coherent integration time affects the capture sensitivity: a longer integration time yields a higher capture sensitivity. A typical capture sampling point buffer is configured with a capacity of 512 KB.
Therefore, in the prior art, the total capacity requirement of all sampling point buffers is 1 MB, which leads to a sharp increase in chip area and power consumption. In practical applications, high-sampling-rate tracking, concurrent operation of all tracking subsystems and full multiplexing of all tracking channels rarely all occur at the same time; therefore, in the traditional design with independent caches, the utilization efficiency of the sampling point cache is low.
Summary of the Invention
One of the objectives of the present invention is to provide a shared caching method that can effectively reduce cache capacity and improve cache utilization, with high reliability and good practicability.
Another objective of the present invention is to provide a baseband processing unit that includes the shared caching method.
A third objective of the present invention is to provide a chip that includes the shared caching method and the baseband processing unit.
The shared caching method provided by the present invention includes the following steps:
S1. Set up a shared cache area shared by a capture subsystem and several tracking subsystems;
S2. Design the shared cache area obtained in step S1 according to the number of access requests. Specifically, there are A tracking subsystems and B capture subsystems; each tracking subsystem has a1 write requests and a2 read requests, and the a1+a2 requests of each tracking subsystem access the same cache area at the same time; each capture subsystem has b1 write requests and b2 read requests, and the b1+b2 requests of each capture subsystem access the same batch of cache areas in a time-shared manner; the shared cache area is designed with a total capacity of C KB and divided into D cache units of E KB each; A, B, a1, a2, b1, b2, C, D and E are all positive integers, and E = C/D;
S3. According to the shared cache area designed in step S2, perform tracking access control, capture access control and cache clock control.
The design of the shared cache area obtained in step S1 according to the number of access requests in step S2 is specifically as follows: there are 8 tracking subsystems and 1 capture subsystem; each tracking subsystem has 1 write request and 4 read requests, and the 5 requests of each tracking subsystem access the same cache area at the same time; the capture subsystem has 1 write request and 1 read request, and the 2 requests of the capture subsystem access the same batch of cache areas in a time-shared manner; the shared cache area is designed with a total capacity of 640 KB and divided into 40 cache units of 16 KB each.
The tracking access control in step S3 is specifically performed by the following steps:
Tracking access control is divided into control flow control, write flow control and read flow control;
For control flow control: control the cache space address, and divide the system time window into several control segments;
For write flow control: control the splicing of the sampling point data, and write the spliced sampling point data into the cache unit in the time slot of the last control segment;
For read flow control: the control is divided into 4 parallel channels that work independently of one another and satisfy the sampling point bandwidth of 4 correlators working at the same time; when the correlator of a channel initiates a read request, the cache unit is read at the scheduled time within the corresponding control time slot, and the data is split and returned to the correlator in order.
The capture access control in step S3 is specifically performed by the following steps:
Configure the starting cache address and space capacity used by the capture subsystem, ensuring that they do not overlap with the cache space of the tracking subsystems; after the capture sampling points are preprocessed, write the data into the capture cache; after the configured number of sampling points has been collected, repeatedly read data from the capture cache for calculation; finally output the capture result and release the capture cache.
The cache clock control in step S3 is specifically performed by the following steps:
Configure the clock of each cache unit individually;
Dynamically switch the clock enable of each cache unit according to the cache unit configuration;
When a cache unit is allocated to a subsystem, the clock of that cache unit is automatically turned on; when the cache unit is released, the clock of that cache unit is automatically turned off.
The present invention also provides a baseband processing unit that includes the above-mentioned shared caching method.
The present invention also provides a chip that includes the above-mentioned shared caching method and baseband processing unit.
By sharing cache units and controlling the shared cache units, the shared caching method, baseband processing unit and chip provided by the present invention effectively improve the utilization rate of the sampling point cache and effectively reduce the cache capacity. At the same time, the invention can effectively reduce the cache area of the chip, which is beneficial to the miniaturized design of the chip. The invention also improves the utilization rate and uniformity of the cache design, reduces the power consumption of the cache, and has high reliability and good practicability.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the functional modules of a baseband processing unit in an existing high-precision navigation chip.
FIG. 2 is a schematic flow chart of the method of the present invention.
FIG. 3 is a functional block diagram of a hardware implementation of the method of the present invention.
FIG. 4 is a schematic diagram of the functional modules of the shared cache unit of the method of the present invention.
FIG. 5 is a schematic flowchart of the tracking access control of the method of the present invention.
FIG. 6 is a schematic diagram of the cache array configuration in an embodiment of the method of the present invention.
Detailed Description
FIG. 2 is a schematic flow chart of the method of the present invention. The shared caching method provided by the present invention includes the following steps:
S1. Set up a shared cache area shared by a capture subsystem and several tracking subsystems (as shown in FIG. 3);
S2. Design the shared cache area obtained in step S1 according to the number of access requests. Specifically, there are A tracking subsystems and B capture subsystems; each tracking subsystem has a1 write requests and a2 read requests, and the a1+a2 requests of each tracking subsystem access the same cache area at the same time; each capture subsystem has b1 write requests and b2 read requests, and the b1+b2 requests of each capture subsystem access the same batch of cache areas in a time-shared manner; the shared cache area is designed with a total capacity of C KB and divided into D cache units of E KB each; A, B, a1, a2, b1, b2, C, D and E are all positive integers, and E = C/D;
In a specific implementation, the technical solution shown in FIG. 4 can be adopted: there are 8 tracking subsystems and 1 capture subsystem; each tracking subsystem has 1 write request and 4 read requests, and the 5 requests (1 write request and 4 read requests) of each tracking subsystem access the same cache area at the same time; the capture subsystem has 1 write request and 1 read request, and its 2 requests (1 write request and 1 read request) access the same batch of cache areas in a time-shared manner; the shared cache area is designed to be 640 KB in total and divided into 40 cache units of 16 KB each;
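To make the unit-based partitioning above concrete, the following C sketch models one possible software-side view of the 40 cache units of 16 KB each: an allocation table with a simple first-fit allocator that keeps subsystem regions non-overlapping. This is a minimal sketch assuming software-managed allocation; all identifiers (unit_owner, alloc_units, the owner codes) are illustrative and are not names taken from the patent.

```c
/* Illustrative sketch only: a software-managed allocation table for the
 * 40 x 16 KB shared cache units.  Identifiers are assumptions, not the
 * patent's API. */
#include <stdbool.h>
#include <stdint.h>

#define CACHE_UNITS      40u              /* 640 KB / 16 KB            */
#define UNIT_SIZE_BYTES  (16u * 1024u)

enum owner { OWNER_NONE = 0, OWNER_TRACK1 = 1, /* ... OWNER_TRACK8 = 8 */
             OWNER_CAPTURE = 9 };

static uint8_t unit_owner[CACHE_UNITS];   /* OWNER_NONE = unallocated  */

/* First-fit allocation of 'count' contiguous units to 'owner'; returns the
 * base unit index (base address = index * UNIT_SIZE_BYTES), or -1 if no
 * non-overlapping region of that size is free. */
static int alloc_units(uint8_t owner, unsigned count)
{
    for (unsigned base = 0; base + count <= CACHE_UNITS; ++base) {
        bool run_is_free = true;
        for (unsigned i = 0; i < count; ++i) {
            if (unit_owner[base + i] != OWNER_NONE) { run_is_free = false; break; }
        }
        if (run_is_free) {
            for (unsigned i = 0; i < count; ++i)
                unit_owner[base + i] = owner;
            return (int)base;
        }
    }
    return -1;
}

/* Release a region so its units can be reallocated (and clock-gated off). */
static void free_units(unsigned base, unsigned count)
{
    for (unsigned i = 0; i < count && base + i < CACHE_UNITS; ++i)
        unit_owner[base + i] = OWNER_NONE;
}
```

Under a model of this kind, the tracking subsystems and the capture subsystem simply request different unit counts from the same 40-unit pool, which is what allows a 640 KB total to replace the 1 MB of independent buffers.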
S3. According to the shared cache area designed in step S2, perform tracking access control (as shown in FIG. 5), capture access control and cache clock control;
Tracking access control: up to 8 tracking subsystems work at the same time, and each subsystem needs to be allocated an independent cache space; each subsystem has different sampling point rate requirements, so the cache space sizes may differ, and the cache spaces must not overlap one another; each subsystem has 1 write request and 4 read requests that access the same cache unit at the same time, so time-shared control is required;
In a specific implementation, the tracking access control is divided into control flow control, write flow control and read flow control;
For control flow control: control the cache space address, and divide the system time window into several control segments;
For write flow control: control the splicing of the sampling point data, and write the spliced sampling point data into the cache unit in the time slot of the last control segment;
For read flow control: the control is divided into 4 parallel channels that work independently of one another and satisfy the sampling point bandwidth of 4 correlators working at the same time; when the correlator of a channel initiates a read request, the cache unit is read at the scheduled time within the corresponding control time slot, and the data is split and returned to the correlator in order;
In the figure:
base addr denotes the allocated base address; buf size denotes the allocated cache capacity; slice_cnt denotes the time window count;
sample_vld denotes the sampling point valid flag; sample_cnt denotes the count of the sampling point valid flag; sample data joint denotes the spliced sampling point data; write buffer denotes writing to the cache unit;
read_req[n] denotes that the n-th channel initiates a read request; read_flag[n] denotes that the n-th channel is currently reading data; slice_cnt denotes the time window count; read buffer denotes reading the cache unit; send samples denotes sending the sampling point data;
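The signal names above suggest a time-sliced arbiter in which the four read channels and the single write of one tracking subsystem share a cache unit within each time window. The following C sketch illustrates that arbitration idea only; the exact slice layout (slots 0 to 3 for the four read channels, the last slot for the write, following the statement that the write happens in the last control segment) and all type and function names are assumptions made for illustration, not details taken from FIG. 5.

```c
/* Sketch of a time-sliced tracking access control, reusing the signal
 * names listed above.  The slot layout is an assumption. */
#include <stdint.h>

#define READ_CHANNELS 4
#define SLICES        5        /* assumed: 4 read slots + 1 write slot   */

typedef struct {
    unsigned slice_cnt;                 /* time window count             */
    unsigned sample_cnt;                /* valid samples spliced so far  */
    uint64_t sample_data_joint;         /* spliced sampling point data   */
    int      read_req[READ_CHANNELS];   /* read request per channel      */
    int      read_flag[READ_CHANNELS];  /* channel currently reading     */
} track_ctrl_t;

/* Called once per control slice for one tracking subsystem. */
static void track_slice_step(track_ctrl_t *c)
{
    unsigned slot = c->slice_cnt % SLICES;

    if (slot < READ_CHANNELS) {
        /* Read slots: serve this channel's correlator if it has a request;
         * the wide cache word would be split and the samples sent in order. */
        c->read_flag[slot] = c->read_req[slot] ? 1 : 0;
        /* if (c->read_flag[slot]) { read_buffer(...); send_samples(...); } */
    } else {
        /* Last slot of the window: write the spliced samples, if any. */
        if (c->sample_cnt > 0) {
            /* write_buffer(base_addr + write_offset, c->sample_data_joint); */
            c->sample_cnt        = 0;
            c->sample_data_joint = 0;
        }
    }
    c->slice_cnt++;
}
```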
Capture access control is performed by the following steps:
Configure the starting cache address and space capacity used by the capture subsystem, ensuring that they do not overlap with the cache space of the tracking subsystems; after the capture sampling points are preprocessed, write the data into the capture cache; after the configured number of sampling points has been collected, repeatedly read data from the capture cache for calculation; finally output the capture result and release the capture cache;
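The capture-side flow above (fill the cache, process it repeatedly, release it) can be summarised in the same style. The sketch below is only an outline of that control flow; the structure fields and the commented-out helper calls (including free_units from the earlier allocation sketch) are illustrative assumptions.

```c
/* Outline of the capture access flow: fill the capture cache with
 * preprocessed samples, re-read it repeatedly for coherent integration,
 * then output the result and release the cache region. */
typedef struct {
    unsigned base_unit;        /* start of the capture region (unit index)  */
    unsigned unit_count;       /* region size in 16 KB units                */
    unsigned samples_needed;   /* integration length configured by software */
    unsigned samples_written;
} capture_ctrl_t;

static void capture_run(capture_ctrl_t *c)
{
    /* 1. Fill phase: write preprocessed sampling points until the
     *    configured number has been collected. */
    while (c->samples_written < c->samples_needed) {
        /* write_capture_buffer(c->base_unit, next_preprocessed_sample()); */
        c->samples_written++;
    }

    /* 2. Processing phase: repeatedly read the buffered block back for
     *    coherent accumulation/integration and matching selection. */
    /* for each code/frequency hypothesis: read_capture_buffer(...); accumulate; */

    /* 3. Output the capture result and release the region so its cache
     *    units can be reused by other subsystems. */
    /* output_capture_result(); free_units(c->base_unit, c->unit_count); */
    c->samples_written = 0;
}
```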
In a specific implementation, to meet the capture time requirements, the operation bit width used by the capture algorithm processing to access the capture cache is 256 bits; therefore the capture operation allocates the shared cache in units of 4 cache units, and the user software needs to allocate space according to this granularity. Cache clock control is performed by the following steps:
Configure the clock of each cache unit individually;
Dynamically switch the clock enable of each cache unit according to the cache unit configuration;
When a cache unit is allocated to a subsystem, the clock of that cache unit is automatically turned on; when the cache unit is released, the clock of that cache unit is automatically turned off, thereby reducing power consumption.
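The clock-control rule above reduces to "clock enable follows allocation". In hardware this would typically be a clock-gating cell per cache unit; the C fragment below only models the control rule, reusing the allocation table assumed in the earlier sketch, and is not the patent's implementation.

```c
/* Per-unit clock gating model: a unit's clock is enabled exactly while
 * it is allocated to some subsystem. */
#include <stdbool.h>
#include <stdint.h>

#define CACHE_UNITS 40u
#define OWNER_NONE  0u

/* Allocation table, as in the earlier allocation sketch. */
static uint8_t unit_owner[CACHE_UNITS];

/* One clock enable per cache unit; in hardware, one clock-gating cell each. */
static bool unit_clk_en[CACHE_UNITS];

/* Re-evaluate the clock enables whenever cache units are allocated or released. */
static void update_clock_enables(void)
{
    for (unsigned i = 0; i < CACHE_UNITS; ++i)
        unit_clk_en[i] = (unit_owner[i] != OWNER_NONE);
}
```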
The advantages of the present invention are illustrated below through a typical application.
The user configures tracking subsystem 1 with 4 cache units, tracking subsystem 2 with 6 cache units, and the capture subsystem with 16 cache units; the configuration of the cache array is shown in FIG. 6.
In this application there are 40 cache units in total, of which 26 are allocated and used, giving a utilization rate of 65%; the unallocated cache units are in the clock-off state.
The method of the present invention designs and partitions the sampling point caches of the tracking system and the capture system in a unified way; user software dynamically allocates cache space to each system, and logic automatically manages the clock switch of each cache unit. This reduces the overall cache area, improves cache utilization and reduces chip power consumption, and therefore has high promotion value. The value is mainly reflected in the following aspects: (1) the cache area of the chip is effectively reduced and the total cache capacity is reduced to 62.5% while still satisfying the cache demand of the vast majority of scenarios, which reduces the overall chip area, is conducive to the miniaturized design of the chip and further lays a foundation for product portability; (2) the utilization rate and uniformity of the cache design are improved: different cache sizes and bandwidths are allocated to different subsystems, which effectively improves utilization, and the uniform size of each cache unit also improves design simplicity and reduces the difficulty of back-end design; (3) cache power consumption is reduced: logic automatically monitors whether each cache unit is allocated and in use and automatically turns the clock of each cache unit on or off, realizing fine-grained power management and effectively reducing chip power consumption.
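As a quick check of the figures used in the application example and in point (1) above, and assuming the 1 MB of the traditional design is counted as 1024 KB (64 KB x 8 + 512 KB):

\[
4 + 6 + 16 = 26 \ \text{allocated units}, \qquad
\frac{26}{40} = 65\,\%, \qquad
\frac{640\ \mathrm{KB}}{1024\ \mathrm{KB}} = 62.5\,\% .
\]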

Claims (7)

  1. A shared caching method, characterized by comprising the following steps:
    S1. setting up a shared cache area shared by a capture subsystem and several tracking subsystems;
    S2. designing the shared cache area obtained in step S1 according to the number of access requests; specifically, there are A tracking subsystems and B capture subsystems; each tracking subsystem has a1 write requests and a2 read requests, and the a1+a2 requests of each tracking subsystem access the same cache area at the same time; each capture subsystem has b1 write requests and b2 read requests, and the b1+b2 requests of each capture subsystem access the same cache area in a time-shared manner; the shared cache area is designed with a total capacity of C KB and divided into D cache units of E KB each; A, B, a1, a2, b1, b2, C, D and E are all positive integers, and E = C/D;
    S3. performing tracking access control, capture access control and cache clock control according to the shared cache area designed in step S2.
  2. The shared caching method according to claim 1, characterized in that the design of the shared cache area obtained in step S1 according to the number of access requests in step S2 is specifically: there are 8 tracking subsystems and 1 capture subsystem; each tracking subsystem has 1 write request and 4 read requests, and the 5 requests of each tracking subsystem access the same cache area at the same time; the capture subsystem has 1 write request and 1 read request, and the 2 requests of the capture subsystem access the same cache area in a time-shared manner; the shared cache area is designed with a total capacity of 640 KB and divided into 40 cache units of 16 KB each.
  3. The shared caching method according to claim 2, characterized in that the tracking access control in step S3 is specifically performed by the following steps:
    dividing the tracking access control into control flow control, write flow control and read flow control;
    for control flow control: controlling the cache space address, and dividing the system time window into several control segments;
    for write flow control: controlling the splicing of the sampling point data, and writing the spliced sampling point data into the cache unit in the time slot of the last control segment;
    for read flow control: dividing the control into 4 parallel channels that work independently of one another and satisfy the sampling point bandwidth of 4 correlators working at the same time; when the correlator of a channel initiates a read request, reading the cache unit at the scheduled time within the corresponding control time slot, splitting the data and returning it to the correlator in order.
  4. The shared caching method according to claim 2, characterized in that the capture access control in step S3 is specifically performed by the following steps:
    configuring the starting cache address and space capacity used by the capture subsystem, and ensuring that they do not overlap with the cache space of the tracking subsystems;
    after the capture sampling points are preprocessed, writing the data into the capture cache; after the configured number of sampling points has been collected, repeatedly reading data from the capture cache for calculation; and finally outputting the capture result and releasing the capture cache.
  5. The shared caching method according to claim 2, characterized in that the cache clock control in step S3 is specifically performed by the following steps:
    configuring the clock of each cache unit individually;
    dynamically switching the clock enable of each cache unit according to the cache unit configuration;
    when a cache unit is allocated to a subsystem, automatically turning on the clock of the cache unit; and when the cache unit is released, automatically turning off the clock of the cache unit.
  6. A baseband processing unit, characterized by including the shared caching method according to any one of claims 1 to 5.
  7. A chip, characterized by including the baseband processing unit according to claim 6.
PCT/CN2021/090998 2020-07-20 2021-04-29 Shared caching method, baseband processing unit, and chip thereof WO2022016946A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010695897.X 2020-07-20
CN202010695897.XA CN111737191B (en) 2020-07-20 2020-07-20 Shared cache method, baseband processing unit and chip thereof

Publications (1)

Publication Number Publication Date
WO2022016946A1 true WO2022016946A1 (en) 2022-01-27

Family

ID=72655030

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/090998 WO2022016946A1 (en) 2020-07-20 2021-04-29 Shared caching method, baseband processing unit, and chip thereof

Country Status (2)

Country Link
CN (1) CN111737191B (en)
WO (1) WO2022016946A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111737191B (en) * 2020-07-20 2021-01-15 长沙海格北斗信息技术有限公司 Shared cache method, baseband processing unit and chip thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120200455A1 (en) * 2011-02-08 2012-08-09 Cambridge Silicon Radio Ltd. Use of gps to detect repetitive motion
CN105137460A (en) * 2015-08-27 2015-12-09 武汉梦芯科技有限公司 Satellite navigation system baseband signal processing system and method
CN105182377A (en) * 2015-08-21 2015-12-23 上海海积信息科技股份有限公司 Receiver board card and receiver
CN105807293A (en) * 2016-05-27 2016-07-27 重庆卓观科技有限公司 SOC (system on chip)-based single-board multi-antenna attitude-determining receiver
CN108761503A (en) * 2018-03-21 2018-11-06 青岛杰瑞自动化有限公司 A kind of multi-mode satellite signal acquisition methods and SOC chip
CN111737191A (en) * 2020-07-20 2020-10-02 长沙海格北斗信息技术有限公司 Shared cache method, baseband processing unit and chip thereof

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8073565B2 (en) * 2000-06-07 2011-12-06 Apple Inc. System and method for alerting a first mobile data processing system nearby a second mobile data processing system
US7999821B1 (en) * 2006-12-19 2011-08-16 Nvidia Corporation Reconfigurable dual texture pipeline with shared texture cache
US8587477B2 (en) * 2010-01-25 2013-11-19 Qualcomm Incorporated Analog front end for system simultaneously receiving GPS and GLONASS signals
US8259012B2 (en) * 2010-04-14 2012-09-04 The Boeing Company Software GNSS receiver for high-altitude spacecraft applications
CN102023302B (en) * 2010-12-17 2012-09-19 浙江大学 Multichannel cooperative control method and device in satellite navigation receiver
CN102053947B (en) * 2011-01-04 2012-07-04 东南大学 Method for realizing reconfiguration of global positioning system (GPS) baseband algorithm
US9009541B2 (en) * 2012-08-20 2015-04-14 Apple Inc. Efficient trace capture buffer management
CN105527631B (en) * 2014-11-26 2016-09-21 航天恒星科技有限公司 Weak signal processing method based on GNSS receiver
CN105866803A (en) * 2016-03-23 2016-08-17 沈阳航空航天大学 Baseband signal quick capturing algorithm for Beidou second-generation satellite navigation receiver based on FPGA
CN111272169A (en) * 2020-02-04 2020-06-12 中国科学院新疆天文台 Pulsar signal interference elimination device, system and method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120200455A1 (en) * 2011-02-08 2012-08-09 Cambridge Silicon Radio Ltd. Use of gps to detect repetitive motion
CN105182377A (en) * 2015-08-21 2015-12-23 上海海积信息科技股份有限公司 Receiver board card and receiver
CN105137460A (en) * 2015-08-27 2015-12-09 武汉梦芯科技有限公司 Satellite navigation system baseband signal processing system and method
CN105807293A (en) * 2016-05-27 2016-07-27 重庆卓观科技有限公司 SOC (system on chip)-based single-board multi-antenna attitude-determining receiver
CN108761503A (en) * 2018-03-21 2018-11-06 青岛杰瑞自动化有限公司 A kind of multi-mode satellite signal acquisition methods and SOC chip
CN111737191A (en) * 2020-07-20 2020-10-02 长沙海格北斗信息技术有限公司 Shared cache method, baseband processing unit and chip thereof

Also Published As

Publication number Publication date
CN111737191A (en) 2020-10-02
CN111737191B (en) 2021-01-15

Similar Documents

Publication Publication Date Title
US11797180B2 (en) Apparatus and method to provide cache move with non-volatile mass memory system
CN102298561B (en) A kind of mthods, systems and devices memory device being carried out to multi-channel data process
US6529416B2 (en) Parallel erase operations in memory systems
US20120079172A1 (en) Memory system
CN101241446A (en) Command scheduling method and apparatus of virtual file system embodied in nonvolatile data storage device
WO2022016946A1 (en) Shared caching method, baseband processing unit, and chip thereof
CN114356223B (en) Memory access method and device, chip and electronic equipment
US10754785B2 (en) Checkpointing for DRAM-less SSD
US10061709B2 (en) Systems and methods for accessing memory
US10162522B1 (en) Architecture of single channel memory controller to support high bandwidth memory of pseudo channel mode or legacy mode
KR20130009928A (en) Effective utilization of flash interface
US11126382B2 (en) SD card-based high-speed data storage method
US9378125B2 (en) Semiconductor chip and method of controlling memory
Chen et al. Delay-based I/O request scheduling in SSDs
WO2022095439A1 (en) Hardware acceleration system for data processing, and chip
CN111581136A (en) DMA controller and implementation method thereof
CN115480708A (en) Method for time division multiplexing local memory access
WO1994008307A1 (en) Multiplexed communication protocol between central and distributed peripherals in multiprocessor computer systems
CN100524357C (en) Data pre-fetching system in video processing
CN111694777B (en) DMA transmission method based on PCIe interface
CN113157602A (en) Method and device for distributing memory and computer readable storage medium
Chen et al. Design and Verification of High Performance Memory Interface Based on AXI Bus
CN115657950B (en) Data read-write processing method and device based on multiple channels and related equipment
Chen et al. Dynamically Reconfigurable Memory Address Mapping for General-Purpose Graphics Processing Unit
US20040064662A1 (en) Methods and apparatus for bus control in digital signal processors

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21845764

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21845764

Country of ref document: EP

Kind code of ref document: A1