CN114265812B - Method, device, equipment and medium for reducing access delay of RISC-V vector processor - Google Patents

Method, device, equipment and medium for reducing access delay of RISC-V vector processor

Info

Publication number
CN114265812B
CN114265812B (application CN202111434561.9A)
Authority
CN
China
Prior art keywords
data
stored
type
risc
storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111434561.9A
Other languages
Chinese (zh)
Other versions
CN114265812A (en
Inventor
张贞雷
李拓
满宏涛
刘同强
周玉龙
邹晓峰
王贤坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Original Assignee
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd filed Critical Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority to CN202111434561.9A priority Critical patent/CN114265812B/en
Publication of CN114265812A publication Critical patent/CN114265812A/en
Application granted granted Critical
Publication of CN114265812B publication Critical patent/CN114265812B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a method, a device, equipment and a readable medium for reducing the access delay of a RISC-V vector processor, wherein the method comprises the following steps: modifying the compiler of the RISC-V vector processor to add a pre-store instruction; in response to data to be stored in the RISC-V vector processor, determining the type of the data to be stored through the pre-store instruction; modifying information in a page table buffer according to the type of the data to be stored; and, in response to receiving the data store instruction, storing the data to be stored to the corresponding location based on the modified information in the page table buffer. The scheme of the invention can greatly reduce the access delay of the RISC-V vector processor and improve the performance of the RISC-V vector processor.

Description

Method, device, equipment and medium for reducing access delay of RISC-V vector processor
Technical Field
The present invention relates to the field of computers, and more particularly to a method, apparatus, device, and readable medium for reducing access latency of a RISC-V vector processor.
Background
Traditional processors mainly come from Intel and ARM. In traditional fields such as PCs and servers, Intel processors hold an absolute monopoly; in the mobile-platform and embedded fields, ARM occupies the core position and has a very high market share.
With increasing international competition, the demand of various industries for domestically developed processors is becoming ever more urgent. RISC-V, read as "RISC Five", denotes the fifth generation of reduced-instruction-set processors. It is a brand-new instruction set architecture whose open source can be freely used by any academic institution or commercial organization, and it therefore has the advantage of being independently controllable.
By extending its vector instruction set, RISC-V can support vector processing and thus serve as a vector processor; that is, efficient computation is realized through the extended vector instruction set, which can be effectively applied to machine learning, computer vision, multimedia applications and the like.
As a vector processor, RISC-V involves a large number of data access operations, and the conventional data storage mechanism has become a critical bottleneck limiting the performance of a RISC-V vector processor. For example, a vector Load instruction may load 1024 words of 32 bits each into the registers. The RISC-V Core determines whether these 1024 x 32 bits of data are in the Cache; if not (a miss), the corresponding data must first be moved from the DDR (double data rate synchronous dynamic random access memory) to the Cache, and then loaded from the Cache into the RISC-V Core. For a vector Store instruction that stores 1024 words of 32 bits each into the Cache or DDR, the RISC-V Core must determine whether the destination address of the 1024 x 32-bit write is mapped in the Cache; if so, the data is written to the Cache, otherwise it is written to the DDR. In some cases a large amount of stored data is not written to the Cache but directly to the DDR; an immediately following load operation then has to fetch the data from the DDR. Since vector instructions typically involve large amounts of data, this incurs significant memory access delay and reduces the data processing performance of the RISC-V Core. In other cases the stored data is written to the Cache but not to the DDR, so that the Cache is occupied by data that will not be used for a long time; this also causes large data access delay and reduces the data processing performance of the RISC-V Core.
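The cost asymmetry described above can be made concrete with a toy latency model (the cycle counts below are illustrative assumptions, not figures from the patent):

```python
# Illustrative latency model: a load that misses the Cache pays the
# DDR-to-Cache transfer on top of the Cache access. All numbers assumed.
CACHE_HIT_CYCLES = 4      # assumed per-word Cache access latency
DDR_ACCESS_CYCLES = 100   # assumed per-word DDR access latency

def load_latency(num_words: int, in_cache: bool) -> int:
    """Cycles to load `num_words` words under this toy model."""
    if in_cache:
        return num_words * CACHE_HIT_CYCLES
    # Miss: move data DDR -> Cache, then Cache -> core.
    return num_words * (DDR_ACCESS_CYCLES + CACHE_HIT_CYCLES)

# A 1024-element vector load, as in the example above:
print(load_latency(1024, in_cache=True))   # 4096
print(load_latency(1024, in_cache=False))  # 106496
```

Even with these crude assumptions, the miss path is over 25x slower for the same vector load, which is the gap the pre-store mechanism tries to close by steering data to the right level in advance.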
Disclosure of Invention
Accordingly, an objective of the embodiments of the present invention is to provide a method, device, equipment and readable medium for reducing the access delay of a RISC-V vector processor, which can greatly reduce the access delay of the RISC-V vector processor and improve its performance.
Based on the above objects, an aspect of an embodiment of the present invention provides a method for reducing access delay of a RISC-V vector processor, comprising the steps of:
modifying the compiler of the RISC-V vector processor to add a pre-store instruction;
in response to data to be stored in the RISC-V vector processor, determining the type of the data to be stored through the pre-store instruction;
modifying information in a page table buffer (Translation Lookaside Buffer, TLB) according to the type of the data to be stored;
in response to receiving the data store instruction, storing the data to be stored to the corresponding location based on the modified information in the page table buffer.
According to one embodiment of the present invention, in response to data to be stored in the RISC-V vector processor, determining the type of the data to be stored via the pre-store instruction comprises:
in response to data to be stored in the RISC-V vector processor, classifying the data to be stored into a first type, a second type and a third type according to information of the data to be stored, wherein the first type indicates that the data will be loaded within a first preset time, the second type indicates that the data will not be loaded within a second preset time, and the third type is conventional data.
According to one embodiment of the invention, modifying the information in the page table buffer according to the type of data to be stored includes:
in response to the type of the data to be stored being the first type, modifying the mapping of virtual address to physical address stored in the page table buffer to the address corresponding to the store instruction, so that the data is written to the cache;
in response to the type of the data to be stored being the second type, modifying the mapping of virtual address to physical address stored in the page table buffer to an address that does not correspond to the store instruction, so that the data is written to the memory;
in response to the type of the data to be stored being the third type, not modifying the information in the page table buffer.
According to one embodiment of the invention, the pre-store instruction has the same store destination address, store length and store width as the store instruction.
In another aspect of an embodiment of the present invention, there is also provided an apparatus for reducing access latency of a RISC-V vector processor, the apparatus comprising:
an adding module configured to modify the compiler of the RISC-V vector processor to add a pre-store instruction;
a judging module configured to, in response to data to be stored in the RISC-V vector processor, determine the type of the data to be stored through the pre-store instruction;
a modifying module configured to modify information in the page table buffer according to the type of the data to be stored;
and a storage module configured to, in response to receiving the data store instruction, store the data to be stored to the corresponding location based on the modified information in the page table buffer.
According to one embodiment of the invention, the judging module is further configured to:
in response to data to be stored in the RISC-V vector processor, classify the data to be stored into a first type, a second type and a third type according to information of the data to be stored, wherein the first type indicates that the data will be loaded within a first preset time, the second type indicates that the data will not be loaded within a second preset time, and the third type is conventional data.
According to one embodiment of the invention, the modification module is further configured to:
in response to the type of the data to be stored being the first type, modify the mapping of virtual address to physical address stored in the page table buffer to the address corresponding to the store instruction, so that the data is written to the cache;
in response to the type of the data to be stored being the second type, modify the mapping of virtual address to physical address stored in the page table buffer to an address that does not correspond to the store instruction, so that the data is written to the memory;
in response to the type of the data to be stored being the third type, not modify the information in the page table buffer.
According to one embodiment of the invention, the pre-store instruction has the same store destination address, store length and store width as the store instruction.
In another aspect of the embodiments of the present invention, there is also provided a computer apparatus including:
at least one processor; and
a memory storing computer instructions executable on the processor, the instructions, when executed by the processor, performing the steps of any of the methods described above.
In another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the methods described above.
The invention has the following beneficial technical effects: the method for reducing the access delay of a RISC-V vector processor provided by the embodiments of the invention adds a pre-store instruction by modifying the compiler of the RISC-V vector processor; in response to data to be stored in the RISC-V vector processor, determines the type of the data to be stored through the pre-store instruction; modifies information in the page table buffer according to the type of the data to be stored; and, in response to receiving the data store instruction, stores the data to be stored to the corresponding location based on the modified information in the page table buffer. This technical scheme can greatly reduce the access delay of the RISC-V vector processor and improve its performance.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings required for the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the invention, and that a person skilled in the art could obtain other embodiments from these drawings without inventive effort.
FIG. 1 is a schematic flow diagram of a method of reducing RISC-V vector processor access latency in accordance with one embodiment of the present invention;
FIG. 2 is a schematic diagram of an apparatus for reducing access latency of a RISC-V vector processor according to one embodiment of the present invention;
FIG. 3 is a schematic diagram of a computer device according to one embodiment of the invention;
fig. 4 is a schematic diagram of a computer-readable storage medium according to one embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
With the above object in mind, in a first aspect, embodiments of the present invention provide an embodiment of a method for reducing access latency of a RISC-V vector processor. Fig. 1 shows a schematic flow chart of the method.
As shown in fig. 1, the method may include the steps of:
s1 modifies the compiler of the RISC-V vector processor to add pre-stored instructions.
The method of the invention exploits the open and easily modifiable nature of RISC-V. By modifying the compiler, a data pre-store instruction (Vector Pre_Store) is added. This instruction is executed before the real Vector Store instruction; it identifies whether the data of the following Vector Store will or will not be read back soon, identifies the data length of that Store instruction, and determines the operations on the TLB and the Cache according to the type of the Store. The pre-store instruction carries the same store destination address, store length and store width as the real Vector Store instruction (the length and width together determine the amount of data of the Store), and additionally contains the type of the Vector Store, which requires the compiler to have a global view of the application's Load/Store behavior.
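As a sketch, the fields the text ascribes to the pre-store instruction can be modeled as follows (the class and field names are assumptions for illustration; the patent does not specify an instruction encoding):

```python
from dataclasses import dataclass
from enum import Enum

class StoreType(Enum):
    SOON_LOADED = 1      # first type: will be loaded again shortly
    NOT_SOON_LOADED = 2  # second type: will not be loaded for a long time
    REGULAR = 3          # third type: conventional data

@dataclass(frozen=True)
class VectorPreStore:
    """Fields shared with the real Vector Store, plus the store type
    inferred by the compiler (names are hypothetical)."""
    dest_addr: int   # same store destination address as the Vector Store
    length: int      # number of elements to store
    width: int       # element width in bits
    store_type: StoreType

    @property
    def data_bits(self) -> int:
        # Length and width together determine the amount of data stored.
        return self.length * self.width

pre = VectorPreStore(dest_addr=0x8000_0000, length=1024, width=32,
                     store_type=StoreType.SOON_LOADED)
print(pre.data_bits)  # 32768
```

The key design point is that everything except `store_type` duplicates the following Vector Store, so the hardware can act on the TLB before the real store arrives.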
S2: in response to data to be stored in the RISC-V vector processor, determine the type of the data to be stored through the pre-store instruction.
The data to be stored may be classified into a first type, a second type and a third type according to information of the data to be stored, wherein the first type indicates that the data will be loaded within a first preset time (i.e. the stored data will soon be loaded), the second type indicates that the data will not be loaded within a second preset time (i.e. the stored data will not be loaded for a long time), and the third type is conventional data.
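A minimal sketch of how a compiler-side heuristic might assign the three types, assuming the preset times are expressed as cycle thresholds (the signature, thresholds, and reuse-distance input are all hypothetical):

```python
from typing import Optional

def classify_store(cycles_until_next_load: Optional[int],
                   first_preset: int, second_preset: int) -> int:
    """Toy classifier returning 1, 2 or 3, matching the three store types.
    `cycles_until_next_load` is the compiler's estimate of when the stored
    data will next be loaded; None means no reload was found at all."""
    if cycles_until_next_load is not None and cycles_until_next_load <= first_preset:
        return 1  # first type: loaded within the first preset time
    if cycles_until_next_load is None or cycles_until_next_load >= second_preset:
        return 2  # second type: not loaded within the second preset time
    return 3      # third type: conventional data, default behaviour
```

For example, with `first_preset=100` and `second_preset=10000`, a store reloaded after 50 cycles is first type, one never reloaded is second type, and one reloaded after 5000 cycles is third type.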
S3: modify the information in the page table buffer according to the type of the data to be stored.
If the type carried by the Pre_Store instruction indicates that the next Store is of the first type, the TLB is updated so that the virtual-to-physical address mapping stored in the TLB corresponds to the destination address of the next Vector Store; the stored data is then written directly to the Cache when the Vector Store instruction is actually executed. If the type indicates that the next Store is of the second type, the TLB is updated so that the virtual-to-physical mapping stored in the TLB does not correspond to the destination address of the next Vector Store; the stored data is then written directly to the DDR, bypassing the Cache. If the type indicates that the next Store is of the third type, the TLB is not modified and normal operation is performed.
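The three TLB actions of step S3 can be sketched with a dictionary standing in for the TLB (the `'cache'`/`'ddr'` target values are a modeling convenience; real hardware rewrites virtual-to-physical mappings):

```python
def pre_store_update_tlb(tlb: dict, dest_page: int, store_type: int) -> None:
    """Apply the TLB action for one Vector Pre_Store (step S3).
    The dict-based TLB is a modeling assumption, not the hardware format."""
    if store_type == 1:
        # First type: make the mapping correspond to the store's destination,
        # so the following Vector Store writes into the Cache.
        tlb[dest_page] = "cache"
    elif store_type == 2:
        # Second type: make the mapping NOT correspond, so the store
        # bypasses the Cache and goes straight to the DDR.
        tlb[dest_page] = "ddr"
    # Third type: leave the TLB unchanged (default behaviour).

tlb = {}
pre_store_update_tlb(tlb, 0x80000, 1)
print(tlb)  # {524288: 'cache'}
```

Note that the third-type branch is deliberately absent: doing nothing is exactly the "do not modify the page table buffer" case from the claims.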
S4: in response to receiving the data store instruction, store the data to be stored to the corresponding location based on the modified information in the page table buffer.
When the Vector Store instruction is executed, the stored data is written to the Cache or the DDR according to the mapping relationship in the TLB. If the data is of the first type, the Store data is written to the Cache; in the extreme case where the Cache capacity cannot hold all of the Store data, the remaining data is written to the DDR. This policy is flexible and can be tuned to the specific application scenario, for example by reserving 1/2 of the Cache capacity for the Vector Store. If the data is of the second type, the Store data is written to the DDR after the Vector Pre_Store has modified the TLB. If the data is of the third type, the TLB retains its default setting. Subsequently, when the first type of data is loaded again, it is loaded directly from the Cache; when the second type of data is loaded again, it must be read from the DDR and then loaded from the DDR into the Cache.
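The routing behaviour described for step S4, including spilling to the DDR when the Cache cannot hold a full first-type store, can be sketched as follows (capacities and the capacity-in-words simplification are illustrative assumptions):

```python
def route_vector_store(num_words: int, store_type: int,
                       cache_free_words: int) -> tuple:
    """Return (words_to_cache, words_to_ddr) for one Vector Store,
    following the per-type policy of step S4. Toy model only."""
    if store_type == 2:
        # Second type: bypass the Cache entirely.
        return (0, num_words)
    if store_type == 1:
        # First type: fill the available Cache space, spill the rest to DDR.
        to_cache = min(num_words, cache_free_words)
        return (to_cache, num_words - to_cache)
    # Third type: default behaviour, modeled here as Cache if it fits.
    return (num_words, 0) if num_words <= cache_free_words else (0, num_words)

# 1024-word first-type store with only 512 words of Cache free
# (e.g. half the capacity reserved, as in the example above):
print(route_vector_store(1024, 1, 512))  # (512, 512)
```

A system integrator could tune `cache_free_words` to model the reserved-capacity policy the text mentions, e.g. half of the Cache held back for vector stores.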
The method of the invention ensures that, after a Vector Store instruction has executed, the access delay of the RISC-V processor is reduced regardless of whether the data is accessed again shortly or not accessed for a long time, improving the performance of the RISC-V processor. It thereby overcomes the drawbacks of the traditional scheme, in which either the Vector Store data is written to the DDR and an immediately following Vector Load must fetch the data from the DDR into the RISC-V Core, or the Vector Store data is written to the Cache but is not accessed again for a long time, lowering Cache utilization; both cases lead to large access delay and low RISC-V processor performance.
Meanwhile, during the interval between the data pre-store instruction (Vector Pre_Store) and the real data store instruction (Vector Store), the RISC-V Core does not access the data Cache, because ordinary instructions do not load data from the data Cache but fetch operands from the general-purpose registers. The execution of the data pre-store instruction therefore does not affect the execution of ordinary instructions; the execution of ordinary instructions and the pre-store process effectively proceed in parallel.
By the technical scheme of the invention, the access delay of the RISC-V vector processor can be greatly reduced, and the performance of the RISC-V vector processor can be improved.
In a preferred embodiment of the present invention, in response to data to be stored in the RISC-V vector processor, determining the type of the data to be stored via the pre-store instruction comprises:
in response to data to be stored in the RISC-V vector processor, classifying the data to be stored into a first type, a second type and a third type according to information of the data to be stored, wherein the first type indicates that the data will be loaded within a first preset time, the second type indicates that the data will not be loaded within a second preset time, and the third type is conventional data. The first type of data will be loaded in a short time, the second type will not be loaded for a long time, and the third type is regular data on which regular operations are performed.
In a preferred embodiment of the invention, modifying the information in the page table buffer according to the type of data to be stored comprises:
in response to the type of the data to be stored being the first type, modifying the mapping of virtual address to physical address stored in the page table buffer to the address corresponding to the store instruction, so that the data is written to the cache;
in response to the type of the data to be stored being the second type, modifying the mapping of virtual address to physical address stored in the page table buffer to an address that does not correspond to the store instruction, so that the data is written to the memory;
in response to the type of the data to be stored being the third type, not modifying the information in the page table buffer.
In a preferred embodiment of the present invention, the pre-store instruction has the same store destination address, store length and store width as the store instruction.
By the technical scheme of the invention, the access delay of the RISC-V vector processor can be greatly reduced, and the performance of the RISC-V vector processor can be improved.
It should be noted that, as will be understood by those skilled in the art, all or part of the processes of the above method embodiments may be implemented by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like. The computer program embodiments described above can achieve the same or similar effects as any of the corresponding method embodiments.
Furthermore, the method disclosed according to the embodiment of the present invention may also be implemented as a computer program executed by a CPU, which may be stored in a computer-readable storage medium. When executed by a CPU, performs the functions defined above in the methods disclosed in the embodiments of the present invention.
With the above object in mind, in a second aspect of the embodiments of the present invention, there is provided an apparatus for reducing access delay of a RISC-V vector processor, as shown in fig. 2, an apparatus 200 includes:
an adding module configured to modify the compiler of the RISC-V vector processor to add a pre-store instruction;
a judging module configured to, in response to data to be stored in the RISC-V vector processor, determine the type of the data to be stored through the pre-store instruction;
a modifying module configured to modify information in the page table buffer according to the type of the data to be stored;
and a storage module configured to, in response to receiving the data store instruction, store the data to be stored to the corresponding location based on the modified information in the page table buffer.
In a preferred embodiment of the present invention, the judging module is further configured to:
in response to data to be stored in the RISC-V vector processor, classify the data to be stored into a first type, a second type and a third type according to information of the data to be stored, wherein the first type indicates that the data will be loaded within a first preset time, the second type indicates that the data will not be loaded within a second preset time, and the third type is conventional data.
In a preferred embodiment of the invention, the modification module is further configured to:
in response to the type of the data to be stored being the first type, modify the mapping of virtual address to physical address stored in the page table buffer to the address corresponding to the store instruction, so that the data is written to the cache;
in response to the type of the data to be stored being the second type, modify the mapping of virtual address to physical address stored in the page table buffer to an address that does not correspond to the store instruction, so that the data is written to the memory;
in response to the type of the data to be stored being the third type, not modify the information in the page table buffer.
In a preferred embodiment of the present invention, the pre-store instruction has the same store destination address, store length and store width as the store instruction.
Based on the above objects, a third aspect of the embodiments of the present invention proposes a computer device. Fig. 3 is a schematic diagram of an embodiment of a computer device provided by the present invention. As shown in fig. 3, an embodiment of the present invention comprises: at least one processor 21; and a memory 22, the memory 22 storing computer instructions 23 executable on the processor, the instructions, when executed by the processor, performing the following method:
modifying the compiler of the RISC-V vector processor to add a pre-store instruction;
in response to data to be stored in the RISC-V vector processor, determining the type of the data to be stored through the pre-store instruction;
modifying information in a page table buffer according to the type of the data to be stored;
in response to receiving the data store instruction, storing the data to be stored to the corresponding location based on the modified information in the page table buffer.
In a preferred embodiment of the present invention, in response to data to be stored in the RISC-V vector processor, determining the type of the data to be stored via the pre-store instruction comprises:
in response to data to be stored in the RISC-V vector processor, classifying the data to be stored into a first type, a second type and a third type according to information of the data to be stored, wherein the first type indicates that the data will be loaded within a first preset time, the second type indicates that the data will not be loaded within a second preset time, and the third type is conventional data.
In a preferred embodiment of the invention, modifying the information in the page table buffer according to the type of data to be stored comprises:
in response to the type of the data to be stored being the first type, modifying the mapping of virtual address to physical address stored in the page table buffer to the address corresponding to the store instruction, so that the data is written to the cache;
in response to the type of the data to be stored being the second type, modifying the mapping of virtual address to physical address stored in the page table buffer to an address that does not correspond to the store instruction, so that the data is written to the memory;
in response to the type of the data to be stored being the third type, not modifying the information in the page table buffer.
In a preferred embodiment of the present invention, the pre-store instruction has the same store destination address, store length and store width as the store instruction.
Based on the above object, a fourth aspect of the embodiments of the present invention proposes a computer-readable storage medium. Fig. 4 is a schematic diagram of an embodiment of a computer-readable storage medium provided by the present invention. As shown in fig. 4, the computer-readable storage medium 31 stores a computer program 32 that, when executed by a processor, performs the following method:
modifying the compiler of the RISC-V vector processor to add a pre-store instruction;
in response to data to be stored in the RISC-V vector processor, determining the type of the data to be stored through the pre-store instruction;
modifying information in a page table buffer according to the type of the data to be stored;
in response to receiving the data store instruction, storing the data to be stored to the corresponding location based on the modified information in the page table buffer.
In a preferred embodiment of the present invention, in response to data to be stored in the RISC-V vector processor, determining the type of the data to be stored via the pre-store instruction comprises:
in response to data to be stored in the RISC-V vector processor, classifying the data to be stored into a first type, a second type and a third type according to information of the data to be stored, wherein the first type indicates that the data will be loaded within a first preset time, the second type indicates that the data will not be loaded within a second preset time, and the third type is conventional data.
In a preferred embodiment of the invention, modifying the information in the page table buffer according to the type of data to be stored comprises:
in response to the type of the data to be stored being the first type, modifying the mapping of virtual address to physical address stored in the page table buffer to the address corresponding to the store instruction, so that the data is written to the cache;
in response to the type of the data to be stored being the second type, modifying the mapping of virtual address to physical address stored in the page table buffer to an address that does not correspond to the store instruction, so that the data is written to the memory;
in response to the type of the data to be stored being the third type, not modifying the information in the page table buffer.
In a preferred embodiment of the present invention, the pre-store instruction has the same store destination address, store length and store width as the store instruction.
Furthermore, the method disclosed according to the embodiment of the present invention may also be implemented as a computer program executed by a processor, which may be stored in a computer-readable storage medium. The above-described functions defined in the methods disclosed in the embodiments of the present invention are performed when the computer program is executed by a processor.
Furthermore, the above-described method steps and system units may also be implemented using a controller and a computer-readable storage medium storing a computer program for causing the controller to implement the above-described steps or unit functions.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
In one or more exemplary designs, the functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a general purpose or special purpose computer or general purpose or special purpose processor. Further, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, includes Compact Disc (CD), laser disc, optical disc, digital Versatile Disc (DVD), floppy disk, blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that as used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The sequence numbers of the foregoing embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, where the storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.
Those of ordinary skill in the art will appreciate that the above discussion of any embodiment is merely exemplary and is not intended to imply that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples. Combinations of features in the above embodiments or in different embodiments are also possible within the spirit of the embodiments of the invention, and many other variations of the different aspects of the embodiments described above exist that are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, equivalent replacements, improvements, and the like made to the embodiments should be included within the protection scope of the embodiments of the present invention.

Claims (6)

1. A method for reducing access delay of a RISC-V vector processor, comprising the steps of:
modifying a compiler of the RISC-V vector processor to add a pre-store instruction;
in response to data being stored to the RISC-V vector processor, determining the type of the data to be stored through the pre-store instruction, wherein determining the type of the data to be stored through the pre-store instruction comprises: in response to data being stored to the RISC-V vector processor, classifying the data to be stored into a first type, a second type, and a third type according to information of the data to be stored, wherein the first type is data to be stored that will be loaded within a first preset time, the second type is data to be stored that will not be loaded within a second preset time, and the third type is regular data;
modifying information in a page table buffer according to the type of the data to be stored, wherein modifying the information in the page table buffer according to the type of the data to be stored comprises: in response to the type of the data to be stored being the first type, modifying the mapping of the virtual address to the physical address stored in the page table buffer to an address corresponding to the store instruction, so that the data is written into the cache; in response to the type of the data to be stored being the second type, modifying the mapping of the virtual address to the physical address stored in the page table buffer to an address not corresponding to the store instruction, so that the data is written into the memory; and in response to the type of the data to be stored being the third type, not modifying the information in the page table buffer;
in response to receiving a data store instruction, storing the data to be stored to a corresponding location based on the information in the modified page table buffer.
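The flow recited in claim 1 can be sketched as a small simulation. All names, thresholds, and the `cacheable` flag below (`DataType`, `TlbEntry`, `classify`, `apply_prestore`, `t1`, `t2`) are illustrative assumptions, not part of the patent or of the RISC-V specification:

```python
# Hypothetical simulation of claim 1: classify the data to be stored,
# patch the page-table-buffer (TLB) entry, and route the store accordingly.
from dataclasses import dataclass
from enum import Enum

class DataType(Enum):
    SOON_LOADED = 1      # first type: loaded within a first preset time -> cache
    NOT_SOON_LOADED = 2  # second type: not loaded within a second preset time -> memory
    REGULAR = 3          # third type: regular data, TLB left untouched

@dataclass
class TlbEntry:
    virtual_addr: int
    physical_addr: int
    cacheable: bool  # stand-in for mapping to an address the store writes to cache

def classify(next_load_delay: int, t1: int = 10, t2: int = 1000) -> DataType:
    """Pre-store instruction's classification from information about the data."""
    if next_load_delay <= t1:
        return DataType.SOON_LOADED
    if next_load_delay > t2:
        return DataType.NOT_SOON_LOADED
    return DataType.REGULAR

def apply_prestore(entry: TlbEntry, t: DataType) -> TlbEntry:
    """Modify the TLB entry according to the classified type."""
    if t is DataType.SOON_LOADED:
        entry.cacheable = True    # store writes into the cache
    elif t is DataType.NOT_SOON_LOADED:
        entry.cacheable = False   # store bypasses the cache to memory
    return entry                  # REGULAR: entry unchanged

entry = apply_prestore(TlbEntry(0x1000, 0x8000, False), classify(next_load_delay=5))
print(entry.cacheable)  # prints: True
```

Here the `cacheable` flag stands in for the modified virtual-to-physical mapping: a first-type store lands in the cache where the imminent load will hit it, a second-type store goes straight to memory without polluting the cache, and regular data leaves the page table buffer unchanged.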
2. The method of claim 1, wherein the pre-store instruction has the same storage destination address, storage length, and storage width as the store instruction.
3. An apparatus for reducing access latency of a RISC-V vector processor, the apparatus comprising:
an adding module configured to modify a compiler of the RISC-V vector processor to add a pre-store instruction;
a judging module configured to, in response to data being stored to the RISC-V vector processor, determine the type of the data to be stored through the pre-store instruction, the judging module being further configured to, in response to data being stored to the RISC-V vector processor, classify the data to be stored into a first type, a second type, and a third type according to information of the data to be stored, wherein the first type is data to be stored that will be loaded within a first preset time, the second type is data to be stored that will not be loaded within a second preset time, and the third type is regular data;
a modifying module configured to modify information in a page table buffer according to the type of the data to be stored, the modifying module being further configured to: in response to the type of the data to be stored being the first type, modify the mapping of the virtual address to the physical address stored in the page table buffer to an address corresponding to the store instruction, so that the data is written into the cache; in response to the type of the data to be stored being the second type, modify the mapping of the virtual address to the physical address stored in the page table buffer to an address not corresponding to the store instruction, so that the data is written into the memory; and in response to the type of the data to be stored being the third type, not modify the information in the page table buffer; and
a storage module configured to, in response to receiving a data store instruction, store the data to be stored to a corresponding location based on the information in the modified page table buffer.
4. The apparatus of claim 3, wherein the pre-store instruction has the same storage destination address, storage length, and storage width as the store instruction.
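Claims 2 and 4 require the pre-store instruction to carry the same destination address, length, and width as its paired store instruction. A minimal consistency check might look like the following, where `MemOp`, `matches`, and `byte_span` are illustrative names and the field layout is an assumption, not taken from the RISC-V ISA:

```python
# Hypothetical check that a pre-store operation mirrors its paired store
# operation's destination address, length, and width (claims 2 and 4).
from typing import NamedTuple

class MemOp(NamedTuple):
    dest_addr: int  # storage destination address
    length: int     # storage length, in elements
    width: int      # storage width, element size in bytes

def matches(prestore: MemOp, store: MemOp) -> bool:
    """True if the pre-store carries the same three fields as the store."""
    return (prestore.dest_addr == store.dest_addr
            and prestore.length == store.length
            and prestore.width == store.width)

def byte_span(op: MemOp) -> range:
    """Byte range the vector store covers: length elements of width bytes."""
    return range(op.dest_addr, op.dest_addr + op.length * op.width)

pre = MemOp(dest_addr=0x2000, length=16, width=8)
st = MemOp(dest_addr=0x2000, length=16, width=8)
print(matches(pre, st), len(byte_span(st)))  # prints: True 128
```

Sharing these fields lets the pre-store instruction describe exactly the region the upcoming store will touch, so the page table buffer can be adjusted for that region before the store instruction arrives.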
5. A computer device, comprising:
at least one processor; and
a memory storing computer instructions executable on the processor, wherein the instructions, when executed by the processor, perform the steps of the method of any one of claims 1-2.
6. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method of any one of claims 1-2.
CN202111434561.9A 2021-11-29 2021-11-29 Method, device, equipment and medium for reducing access delay of RISC-V vector processor Active CN114265812B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111434561.9A CN114265812B (en) 2021-11-29 2021-11-29 Method, device, equipment and medium for reducing access delay of RISC-V vector processor

Publications (2)

Publication Number Publication Date
CN114265812A CN114265812A (en) 2022-04-01
CN114265812B true CN114265812B (en) 2024-02-02

Family

ID=80825780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111434561.9A Active CN114265812B (en) 2021-11-29 2021-11-29 Method, device, equipment and medium for reducing access delay of RISC-V vector processor

Country Status (1)

Country Link
CN (1) CN114265812B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101187859A (en) * 2006-11-17 2008-05-28 上海高性能集成电路设计中心 Data stream prefetching method based on access instruction
CN104239225A (en) * 2014-09-04 2014-12-24 浪潮(北京)电子信息产业有限公司 Method and device for managing heterogeneous hybrid memory
CN104252392A (en) * 2013-06-28 2014-12-31 华为技术有限公司 Method for accessing data cache and processor
CN107250993A (en) * 2015-02-23 2017-10-13 英特尔公司 Vectorial cache lines write back processor, method, system and instruction
CN111177066A (en) * 2019-12-29 2020-05-19 苏州浪潮智能科技有限公司 Method, device and medium for improving efficiency of accessing off-chip memory
CN112306910A (en) * 2019-07-31 2021-02-02 英特尔公司 Hardware for split data conversion look-aside buffer
CN112463657A (en) * 2019-09-09 2021-03-09 阿里巴巴集团控股有限公司 Processing method and processing device for address translation cache clearing instruction
CN112463074A (en) * 2020-12-14 2021-03-09 苏州浪潮智能科技有限公司 Data classification storage method, system, terminal and storage medium
US11023375B1 (en) * 2020-02-21 2021-06-01 SiFive, Inc. Data cache with hybrid writeback and writethrough
WO2021178493A1 (en) * 2020-03-03 2021-09-10 Dover Microsystems, Inc. Systems and methods for caching metadata

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9311094B2 (en) * 2011-01-21 2016-04-12 Apple Inc. Predicting a pattern in addresses for a memory-accessing instruction when processing vector instructions
US20140281116A1 (en) * 2013-03-15 2014-09-18 Soft Machines, Inc. Method and Apparatus to Speed up the Load Access and Data Return Speed Path Using Early Lower Address Bits
US20170286114A1 (en) * 2016-04-02 2017-10-05 Intel Corporation Processors, methods, and systems to allocate load and store buffers based on instruction type
US10409603B2 (en) * 2016-12-30 2019-09-10 Intel Corporation Processors, methods, systems, and instructions to check and store indications of whether memory addresses are in persistent memory
US11494311B2 (en) * 2019-09-17 2022-11-08 Micron Technology, Inc. Page table hooks to memory types

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on High-Bandwidth Memory Access Pipelines for General-Purpose Processors; Zhang Hao; Lin Wei; Zhou Yongbin; Ye Xiaochun; Fan Dongrui; Chinese Journal of Computers (No. 01); pp. 144-153 *

Also Published As

Publication number Publication date
CN114265812A (en) 2022-04-01

Similar Documents

Publication Publication Date Title
US8745334B2 (en) Sectored cache replacement algorithm for reducing memory writebacks
US11593117B2 (en) Combining load or store instructions
WO2016099664A1 (en) Apparatus, system and method for caching compressed data background
US20130036426A1 (en) Information processing device and task switching method
US20210173789A1 (en) System and method for storing cache location information for cache entry transfer
JP2007048296A (en) Method, apparatus and system for invalidating multiple address cache entries
JP2020519991A (en) Apparatus and method for managing capability metadata
CN113641596A (en) Cache management method, cache management device and processor
JP2008047124A (en) Method and unit for processing computer graphics data
US8028118B2 (en) Using an index value located on a page table to index page attributes
CN102096562A (en) Data writing method and device
US8966186B2 (en) Cache memory prefetching
JP4113524B2 (en) Cache memory system and control method thereof
TW201729100A (en) Memory apparatus and data accessing method thereof
JP4666511B2 (en) Memory caching in data processing
JP5607603B2 (en) Method, apparatus, and computer program for cache management
CN102521161B (en) Data caching method, device and server
CN114265812B (en) Method, device, equipment and medium for reducing access delay of RISC-V vector processor
CN102037448A (en) Device emulation support within a host data processing apparatus
CN110716887B (en) Hardware cache data loading method supporting write hint
CN108519860B (en) SSD read hit processing method and device
US20070079109A1 (en) Simulation apparatus and simulation method
CN108268380A (en) A kind of method and apparatus for reading and writing data
CN112559389A (en) Storage control device, processing device, computer system, and storage control method
CN116382582A (en) RAID remapping method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant