CN113590508B - Dynamic reconfigurable memory address mapping method and device - Google Patents


Info

Publication number
CN113590508B
Authority
CN
China
Prior art keywords
memory
target application
address mapping
memory address
chip
Prior art date
Legal status
Active
Application number
CN202111155689.1A
Other languages
Chinese (zh)
Other versions
CN113590508A (en)
Inventor
Inventor not announced (不公告发明人)
Current Assignee
Muxi Technology Beijing Co ltd
Original Assignee
Muxi Technology Beijing Co ltd
Priority date
Filing date
Publication date
Application filed by Muxi Technology Beijing Co ltd filed Critical Muxi Technology Beijing Co ltd
Priority to CN202111155689.1A
Publication of CN113590508A
Application granted
Publication of CN113590508B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10Address translation
    • G06F12/1009Address translation using page tables, e.g. page table structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System (AREA)

Abstract

The invention provides a dynamically reconfigurable memory address mapping method and device, relating to chip technology. The method acquires the configuration parameters of a chip and the memory concurrent access mode of a target application, and generates the memory address mapping relation of the target application based on the configuration parameters and the memory concurrent access mode; it then receives a user's execution request for the target application, calls the memory address mapping relation corresponding to the target application according to the execution request, and dynamically configures the memory subsystem of the chip according to the memory address mapping relation. By generating a memory address mapping relation tailored to each specific target application, the scheme adapts to the more complex memory access modes found in application scenarios such as artificial intelligence and high-performance computing.

Description

Dynamic reconfigurable memory address mapping method and device
Technical Field
The present invention relates to chip technologies, and in particular, to a dynamically reconfigurable memory address mapping method and apparatus.
Background
High-throughput computing chips such as GPUs and AI chips need the support of a high-bandwidth memory subsystem, and existing memory systems offer extremely high theoretical bandwidth. However, applications actually running on the chip are subject to resource contention within the memory subsystem and among concurrent computing units, making it difficult to reach the memory system's nominal theoretical bandwidth; improper resource usage may also incur extra power consumption overhead.
In the prior art, to mitigate this resource contention, concurrent memory access requests are dispersed, by remapping memory addresses, across memory resources that can process multiple memory requests in parallel.
However, the access modes of applications running on high-throughput computing chips such as GPUs and AI chips are complex and changeable, and the parameters of the memory subsystem also change with different configurations of the chip (such as virtualization), so a single fixed address mapping function can hardly meet the requirements of all scenarios.
Disclosure of Invention
The embodiments of the invention provide a dynamically reconfigurable memory address mapping method and device, which combine the configuration parameters of the chip with the target application to generate a corresponding memory address mapping relation, so as to adapt to the more complex memory access modes found in application scenarios such as artificial intelligence and high-performance computing.
In a first aspect of the embodiments of the present invention, a dynamically reconfigurable memory address mapping method is provided, including:
acquiring configuration parameters of a chip and a memory concurrent access mode of a target application, and generating a memory address mapping relation of the target application based on the configuration parameters and the memory concurrent access mode;
receiving an execution request of a user for the target application, and calling the memory address mapping relation corresponding to the target application according to the execution request;
and dynamically configuring the memory subsystem of the chip according to the memory address mapping relation.
Optionally, in a possible implementation manner of the first aspect, the obtaining a memory concurrent access mode of a target application includes:
and acquiring code information of the target application, and acquiring a memory concurrent access mode of the target application according to the code information.
Optionally, in a possible implementation manner of the first aspect, the obtaining a memory concurrent access mode of a target application includes:
and acquiring the bit turning rate of the memory access stream of the target application, and acquiring the memory concurrent access mode of the target application based on the bit turning rate.
Optionally, in a possible implementation manner of the first aspect, the obtaining a bit flipping rate of a memory access stream of the target application includes:
recording the bit flipping rate by means of a recording device running on a simulator or on the chip itself;
and acquiring, based on the device, the bit flipping rate of the memory access stream of the target application within a preset time period.
Optionally, in a possible implementation manner of the first aspect, the obtaining configuration parameters of a chip includes:
and acquiring the architecture parameters, the memory system parameters and the scheduling strategy of the chip.
Optionally, in a possible implementation manner of the first aspect, after generating the memory address mapping relationship of the target application based on the configuration parameter and the memory concurrent access mode, the method further includes:
and binding the target application and the memory address mapping relation based on a preset position.
Optionally, in a possible implementation manner of the first aspect, the preset position includes:
and the position of the metadata of the executable file corresponding to the target application or the position of the page table entry corresponding to each data block in the target application.
In a second aspect of the embodiments of the present invention, a dynamically reconfigurable memory address mapping apparatus is provided, including:
the mapping module is used for acquiring configuration parameters of a chip and a memory concurrent access mode of a target application, and generating a memory address mapping relation of the target application based on the configuration parameters and the memory concurrent access mode;
the calling module is used for receiving an execution request of a user for the target application and calling the memory address mapping relation corresponding to the target application according to the execution request;
and the execution module is used for dynamically configuring the memory subsystem of the chip according to the memory address mapping relation.
In a third aspect of the embodiments of the present invention, a dynamically reconfigurable memory address mapping device is provided, including: a memory, a processor and a computer program, the computer program being stored in the memory, and the processor running the computer program to perform the method according to the first aspect of the invention and its various possible implementations.
A fourth aspect of the embodiments of the present invention provides a readable storage medium, in which a computer program is stored, the computer program, when executed by a processor, implementing the method according to the first aspect of the present invention and its various possible implementations.
According to the dynamic reconfigurable memory address mapping method and device provided by the invention, the memory address mapping relation of the target application is generated through the configuration parameters of the chip and the memory concurrent access mode of the target application, and when the target application is executed subsequently, the memory subsystem of the chip is dynamically configured according to the memory address mapping relation, so that the corresponding memory address mapping relation can be generated by combining with the specific target application, and the dynamic reconfigurable memory address mapping method and device are suitable for more complex memory access modes in application scenes such as artificial intelligence, high-performance calculation and the like.
Drawings
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present invention.
Fig. 2 is a schematic flowchart of a dynamically reconfigurable memory address mapping method according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of column-first and row-first concurrent thread scheduling policies for the same application according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of how the computing unit scheduling policy affects the memory concurrent access mode on a high-throughput computing chip according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of an apparatus for recording a bit flip rate according to an embodiment of the present invention.
Fig. 6 is a schematic structural diagram of a dynamically reconfigurable memory address mapping apparatus according to an embodiment of the present invention.
Fig. 7 is a schematic diagram of the hardware structure of a dynamically reconfigurable memory address mapping device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein.
It should be understood that, in various embodiments of the present invention, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the internal logic of the processes, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
It should be understood that in the present application, "comprising" and "having" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that, in the present invention, "a plurality" means two or more. "And/or" merely describes an association between objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "Comprises A, B and C" and "comprises A, B, C" mean that all three of A, B and C are included; "comprises A, B or C" means that one of A, B and C is included; "comprises A, B and/or C" means that any one, any two, or all three of A, B and C are included.
It should be understood that, in the present invention, "B corresponding to A", "A corresponds to B", or "B corresponds to A" means that B is associated with A and B can be determined from A. Determining B from A does not mean determining B from A alone; B may be determined from A and/or other information. The matching of A and B means that the similarity between A and B is greater than or equal to a preset threshold.
As used herein, "if" may be interpreted as "at … …" or "when … …" or "in response to a determination" or "in response to a detection", depending on the context.
The technical solution of the present invention will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Referring to fig. 1, a schematic diagram of an application scenario provided by an embodiment of the present invention shows a prior-art DRAM memory subsystem with the four-level organization structure of a common DRAM: channel, cluster (bank), row, and column. The DRAM system illustrated in the figure has 32 channels with 16 clusters per channel. Specific bit fields in a memory access address indicate the sequence numbers of the channel, cluster, row, and column to be accessed. In existing memory systems, the number of resources at each level may differ from this example, and more design levels may be introduced, bringing more complicated resource contention; for example, HBM2 adds a bank group level between clusters and channels. To fully utilize the theoretical bandwidth provided by a memory system, the computing units need to generate enough parallel memory requests that can be processed concurrently. In practice, however, taking the DRAM system of fig. 1 as an example, two parallel memory accesses from the computing units may fail to be processed concurrently by the DRAM memory system because they contend for the I/O bus of the same channel, the row cache of the same cluster, and so on. Concurrent memory requests therefore need to access different channels, or different clusters of the same channel, as much as possible, which means the bit fields corresponding to channels and clusters in the memory addresses accessed by the computing units should change as much as possible.
On traditional computing chips such as CPUs, the concurrent memory access modes of most applications are simple. For applications with large memory-level parallelism, the memory access addresses of the computing units typically accumulate sequentially, for example while traversing an array. Therefore, to use the DRAM memory subsystem of fig. 1 efficiently, a typical CPU may adopt the memory address mapping scheme shown in fig. 1: the highest bit field, which changes least frequently under sequentially accumulating access, is defined as the row number, reducing the performance and power consumption overhead of row cache updates; the middle bit fields, which change more frequently, are defined as the channel and cluster numbers, exploiting the parallelism of the memory subsystem as much as possible; and the lowest bit field, which changes most frequently, is defined as the column number, making full use of row cache locality.
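As a minimal sketch (not taken from the patent), the mapping just described can be modeled by slicing an address into column, channel, cluster, and row bit fields. The channel and cluster widths below follow the fig. 1 geometry (32 channels, 16 clusters per channel); the 6-bit column width is an arbitrary assumption for illustration.

```python
# Illustrative decode of a physical address under the CPU-style mapping above:
# lowest bits = column, middle bits = channel then cluster, highest bits = row.
# Field widths are assumptions (6-bit column chosen for illustration only).
COLUMN_BITS, CHANNEL_BITS, CLUSTER_BITS = 6, 5, 4

def decode(addr: int) -> dict:
    column = addr & ((1 << COLUMN_BITS) - 1)      # changes most often: column
    addr >>= COLUMN_BITS
    channel = addr & ((1 << CHANNEL_BITS) - 1)    # middle bits: channel number
    addr >>= CHANNEL_BITS
    cluster = addr & ((1 << CLUSTER_BITS) - 1)    # middle bits: cluster number
    addr >>= CLUSTER_BITS
    return {"row": addr, "cluster": cluster, "channel": channel, "column": column}

# Sequentially accumulating addresses first sweep the column, then spread over
# channels and clusters, and change the row bits least often.
```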
However, the access modes of applications running on high-throughput computing chips such as GPUs and AI chips are complex and changeable, and the parameters of the memory subsystem also change with different configurations of the chip (such as virtualization), so a single fixed address mapping function can hardly meet the requirements of all scenarios.
To solve the above technical problem, referring to fig. 2, a flowchart of a dynamically reconfigurable memory address mapping method according to an embodiment of the present invention is shown; the execution subject of the method shown in fig. 2 may be a software and/or hardware device, including but not limited to at least one of: user equipment, network equipment, and the like. The user equipment may include, but is not limited to, a computer, a smart phone, a personal digital assistant (PDA), the electronic equipment mentioned above, and the like. The network device may include, but is not limited to, a single network server, a server group of multiple network servers, or a cloud of numerous computers or network servers based on cloud computing, where cloud computing is a type of distributed computing: a super virtual computer consisting of a cluster of loosely coupled computers. This embodiment does not limit this. The dynamically reconfigurable memory address mapping method includes steps S101 to S103, specifically as follows:
s101, obtaining configuration parameters of a chip and a memory concurrent access mode of a target application, and generating a memory address mapping relation of the target application based on the configuration parameters and the memory concurrent access mode.
Specifically, according to the scheme, the configuration parameters of the chip are combined with the specific memory concurrent access mode of the target application to generate the memory address mapping relation corresponding to the target application. For example, the target application a corresponds to a memory address mapping relation 1, and the target application B corresponds to a memory address mapping relation 2.
It can be appreciated that, because the dynamically configurable address mapping schemes in the prior art are directed only at graphics rendering scenarios rather than at general high-throughput computing, they cannot provide a method for selecting an address mapping scheme according to application characteristics. The present scheme generates a corresponding memory address mapping relation in combination with the specific target application, so as to adapt to the more complex memory access modes found in application scenarios such as artificial intelligence and high-performance computing.
In practical applications, the configuration parameters of the chip may be architecture parameters, memory system parameters, and scheduling policies of the chip.
It will be appreciated that many parameters of the chip affect memory-level parallelism. For example, the tFAW parameter in the DDR and GDDR standards specifies that within any time window of length tFAW, at most 4 different row caches may be updated; further row cache update requests must queue. tFAW therefore limits the inter-cluster memory access parallelism within a single channel. Such parameters differ with the memory subsystem architecture and memory system parameters of the chip, which is why the architecture parameters of the chip are acquired.
It will also be appreciated that different scheduling policies of the high-throughput computing chip may also change the memory access mode of an application. Referring to fig. 3: assume that 16 threads in an application can be executed concurrently 4 at a time, each thread accesses a row of data in memory in turn, and the memory system has 4 channels. The high-throughput computing chip in fig. 3 schedules the concurrent threads in row-first or column-first order; in the label txy, x is the sequence number of the memory row the thread will access and y is the sequence number of the channel it will access. Fig. 4 shows the channel sequence numbers accessed by each thread in the first round under the two scheduling policies: with row-first scheduling, the concurrent memory accesses of the 4 concurrent threads are uniformly distributed over the 4 channels; with column-first scheduling, they are all concentrated in the same channel, and because of the resulting channel contention, the performance of the column-first scheduling policy is much lower than that of the row-first policy. Different scheduling policies therefore also affect the memory concurrent access mode of the target application.
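The fig. 3/fig. 4 scenario can be sketched as follows (a hypothetical model, not the patent's implementation): 16 threads txy sit in a 4x4 grid, thread txy first accesses channel y, and 4 threads run concurrently; the two scheduling orders then differ in how many distinct channels the first wave of accesses touches.

```python
# Illustrative model of the scheduling example above: thread txy in a 4x4 grid
# accesses channel y first; compare the channels hit by the first wave of 4
# concurrent threads under row-first vs. column-first scheduling.
def first_wave(order: str) -> list:
    grid = [[(x, y) for y in range(4)] for x in range(4)]  # thread txy
    if order == "row-first":
        wave = grid[0]                         # t00, t01, t02, t03
    else:                                      # column-first
        wave = [grid[x][0] for x in range(4)]  # t00, t10, t20, t30
    return [y for (_, y) in wave]              # channels accessed concurrently

# Row-first spreads the 4 concurrent accesses over 4 distinct channels;
# column-first concentrates all of them on channel 0.
```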
Therefore, according to the scheme, the memory address mapping relation is simultaneously acquired by combining the configuration parameters of the chip and the memory concurrent access mode of the target application.
It should be noted that after the memory address mapping relationship of the target application is generated based on the configuration parameters and the concurrent memory access mode, the target application and the memory address mapping relationship also need to be bound based on a preset position. It can be understood that, in order to call the memory address mapping relationship when the target application is executed subsequently, the two need to be bound together.
In an actual application, the preset position may be a position corresponding to metadata of an executable file of the target application, or a position corresponding to a page table entry of each data block in the target application. It will be appreciated that each application may specify the memory address mapping it needs in a particular location, such as the metadata of its executable file. Meanwhile, each data block in an application may specify a memory address mapping relationship required by the data block in a specific location, for example, a page table entry corresponding to the data block.
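A minimal sketch of this binding, assuming a dictionary stands in for the executable-file metadata or page-table entry (the names `bind_mapping` and `memory_address_mapping` are illustrative, not from the patent):

```python
# Hypothetical sketch: bind a mapping relation to a target application at a
# "preset position" -- here a plain dict models the metadata of the executable
# (or a page-table entry for one data block). All names are assumptions.
def bind_mapping(preset_position: dict, mapping_id: int) -> None:
    preset_position["memory_address_mapping"] = mapping_id

def lookup_mapping(preset_position: dict) -> int:
    # Called on an execution request to retrieve the bound mapping relation.
    return preset_position["memory_address_mapping"]
```

On an execution request, the chip would look up the bound mapping before configuring the memory subsystem.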
S102, receiving an execution request of a user to the target application, and calling the memory address mapping relation corresponding to the target application according to the execution request.
Specifically, in this step, when the target application is executed, the main controller detects an execution request for the target application, and since the memory address mapping relationship corresponding to the target application is already generated in step S101, the target application may call the memory address mapping relationship to execute the operation in step S103.
S103, dynamically configuring the memory subsystem of the chip according to the memory address mapping relation.
It can be understood that, when the high throughput computing chip executes the target application, the memory subsystem of the chip is dynamically configured according to the selected memory address mapping relationship, and the data stored in the memory subsystem is directly migrated and configured.
It should be noted that although a memory mapping scheme also exists in prior-art memory subsystems, switching it is static: in a conventional computing chip such as a CPU, data must remain in the memory subsystem for a long time, and if the memory mapping scheme were switched dynamically, the storage address of the same data would differ before and after the switch, invalidating the data. Existing schemes can therefore switch the memory mapping scheme only when the chip restarts, i.e., they support only static switching.
It should be further noted that, compared with the prior art, in the high throughput chip of this scheme, data in the memory subsystem may be frequently migrated between the memory subsystem and the external storage under the control of one host controller, and the migration process is completely controlled by the host controller, so that the memory mapping scheme of the data block may be switched each time a data block is migrated into the memory system, thereby improving the performance of the high throughput chip and reducing the power consumption thereof.
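The per-block switching described above can be sketched as follows (an illustrative model with hypothetical names): because the host controller places every element of a migrated block itself, it can apply whatever mapping function was selected for that block, and no address computed under the old mapping survives the migration.

```python
# Illustrative sketch: when a data block is migrated into the memory subsystem,
# each element is placed at the address assigned by the newly selected mapping
# function, so the mapping can be switched per block without stale addresses.
def migrate_block(block: list, new_mapping) -> dict:
    memory = {}
    for offset, value in enumerate(block):
        memory[new_mapping(offset)] = value  # place data under the new mapping
    return memory

# Example: migrate a 4-element block under a mapping that swaps the two low
# address bits (a toy mapping function chosen for illustration).
swapped = migrate_block(["a", "b", "c", "d"],
                        lambda a: ((a & 1) << 1) | ((a >> 1) & 1))
```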
Based on the foregoing embodiment, a specific implementation of obtaining the memory concurrent access mode of the target application in step S101 may be:
in some embodiments, the memory concurrent access mode of the target application is obtained, and the memory concurrent access mode includes static analysis and dynamic analysis, wherein the static analysis may be applied to some rules, and the dynamic analysis may be applied to some non-rules, so as to adapt to more complex memory access modes existing in application scenarios such as artificial intelligence and high-performance computing.
Static analysis:
and acquiring code information of the target application, and acquiring a memory concurrent access mode of the target application according to the code information.
It is understood that static analysis can identify fixed-stride accesses and known specific memory access patterns such as space-filling curve orders (e.g., Z-Morton, Hilbert) by analyzing the source code of the target application.
Dynamic analysis:
and acquiring the bit turning rate of the memory access stream of the target application, and acquiring the memory concurrent access mode of the target application based on the bit turning rate.
It should be noted that, for an irregular application and an application whose memory access mode is sensitive to a specific scheduling policy, the bit flipping rate of the memory access stream of the target application may be obtained, and then the memory concurrent access mode of the target application is obtained according to the bit flipping rate.
In practical applications, the device for recording the bit flipping rate shown in fig. 5 may be used: the bit flipping rate of the target application's memory access stream over a certain time period (a preset time period) is counted on a simulator or an actual chip, so as to discover concurrent memory access modes in the target application that are difficult to find by conventional static analysis.
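A software model of such a recording device might look as follows (a hypothetical sketch; the fig. 5 device itself is not specified here): it XORs consecutive addresses in the sampled stream and accumulates, per address bit, how often that bit flips. High-flip bits are natural candidates for the channel/cluster fields of the mapping, near-constant bits for the row field.

```python
# Hypothetical model of a bit-flip-rate recorder: for each address bit, count
# how often it changes between consecutive accesses in the sampled stream.
def bit_flip_rates(addresses: list, width: int = 32) -> list:
    flips = [0] * width
    for prev, cur in zip(addresses, addresses[1:]):
        diff = prev ^ cur                # XOR marks the bits that flipped
        for b in range(width):
            flips[b] += (diff >> b) & 1
    n = max(len(addresses) - 1, 1)
    return [f / n for f in flips]        # per-bit flip rate in [0, 1]

# A sequentially accumulating stream flips bit 0 on every access and higher
# bits progressively less often -- the signature of the simple CPU-style pattern.
rates = bit_flip_rates(list(range(8)))
```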
It should be noted that, in order to reduce the data amount of dynamic analysis, the present solution may perform sampling analysis on a segment of a target application, that is, analyze data within the preset time period, and need not analyze all data, thereby reducing power consumption.
Referring to fig. 6, which is a schematic structural diagram of a dynamically reconfigurable memory address mapping apparatus according to an embodiment of the present invention, the dynamically reconfigurable memory address mapping apparatus 60 includes:
the mapping module 61 is configured to obtain configuration parameters of a chip and a memory concurrent access mode of a target application, and generate a memory address mapping relationship of the target application based on the configuration parameters and the memory concurrent access mode;
a calling module 62, configured to receive an execution request of the target application from a user, and call the memory address mapping relationship corresponding to the target application according to the execution request;
and the execution module 63 is configured to dynamically configure the memory subsystem of the chip according to the memory address mapping relationship.
The apparatus in the embodiment shown in fig. 6 can be correspondingly used to perform the steps in the method embodiment shown in fig. 2, and the implementation principle and technical effect are similar, which are not described herein again.
Referring to fig. 7, which is a schematic diagram of a hardware structure of a dynamically reconfigurable memory address mapping device according to an embodiment of the present invention, the dynamically reconfigurable memory address mapping device 70 includes: a processor 71, a memory 72 and computer programs; wherein
A memory 72 for storing the computer program, which may also be a flash memory (flash). The computer program is, for example, an application program, a functional module, or the like that implements the above method.
A processor 71 for executing the computer program stored in the memory to implement the steps performed by the apparatus in the above method. Reference may be made in particular to the description relating to the preceding method embodiment.
Alternatively, the memory 72 may be separate or integrated with the processor 71.
When the memory 72 is a device separate from the processor 71, the apparatus may further include:
a bus 73 for connecting the memory 72 and the processor 71.
The present invention also provides a readable storage medium, in which a computer program is stored, which, when being executed by a processor, is adapted to implement the methods provided by the various embodiments described above.
The readable storage medium may be a computer storage medium or a communication medium. Communication media include any medium that facilitates transfer of a computer program from one place to another. Computer storage media may be any available media that can be accessed by a general-purpose or special-purpose computer. For example, a readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Of course, the readable storage medium may also be an integral part of the processor. The processor and the readable storage medium may reside in an application-specific integrated circuit (ASIC). Additionally, the ASIC may reside in user equipment. Of course, the processor and the readable storage medium may also reside as discrete components in a communication device. The readable storage medium may be a read-only memory (ROM), a random-access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
The present invention also provides a program product comprising execution instructions stored in a readable storage medium. At least one processor of the device may read the execution instructions from the readable storage medium, and execution of those instructions by the at least one processor causes the device to implement the methods provided by the various embodiments described above.
In the above device embodiments, it should be understood that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the present invention may be performed directly by a hardware processor, or by a combination of hardware and software modules within the processor.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
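As an illustration only — this is not the patented implementation, and every name below (the functions, the field layouts, and the binding table) is a hypothetical stand-in — the three-step flow described above (generate a memory address mapping from chip configuration parameters and the application's concurrent access pattern, bind the mapping to the target application, then configure the memory subsystem when an execution request arrives) can be sketched as:

```python
# Hypothetical sketch of the dynamically reconfigurable address mapping flow.
# A "mapping" here is simply an assignment of physical-address bits to DRAM
# fields; swapping which bits select the channel changes memory-level parallelism.

def make_mapping(channel_bits, bank_bits, column_bits, row_bits):
    """A memory address mapping: which physical-address bits select each field."""
    return {"channel": channel_bits, "bank": bank_bits,
            "column": column_bits, "row": row_bits}

def decode(addr, mapping):
    """Split a physical address into DRAM fields under a given mapping."""
    fields = {}
    for name, bits in mapping.items():
        value = 0
        for i, b in enumerate(bits):          # gather the selected bits
            value |= ((addr >> b) & 1) << i
        fields[name] = value
    return fields

def generate_mapping(config_params, access_pattern):
    """Step 1: derive a mapping from chip parameters and the access pattern.
    Regular (streaming) applications spread consecutive cache lines across
    channels; irregular ones keep neighbouring lines in one channel."""
    line = config_params["cacheline_bits"]    # e.g. 6 for 64-byte lines
    channel = [line] if access_pattern == "regular" else [20]
    banks = [channel[0] + 1, channel[0] + 2]
    rows = [b for b in range(line, 32) if b not in channel + banks]
    return make_mapping(channel_bits=channel, bank_bits=banks,
                        column_bits=list(range(line)), row_bits=rows)

# Step 2: bind the generated mapping to the target application (a dict stands
# in for the executable's metadata or per-data-block page table entries).
bindings = {}
bindings["app_A"] = generate_mapping({"cacheline_bits": 6}, "regular")

# Step 3: on an execution request, look up the binding and (re)configure the
# memory subsystem -- modeled here as selecting which mapping decodes addresses.
mapping = bindings["app_A"]
c0 = decode(0x0000, mapping)["channel"]
c1 = decode(0x0040, mapping)["channel"]       # next 64-byte line: other channel
```

Under the "regular" mapping, consecutive cache lines land on different channels and can be accessed in parallel; under the "irregular" mapping, the same two addresses decode to one channel.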

Claims (10)

1. A dynamically reconfigurable memory address mapping method, comprising:
obtaining configuration parameters of a chip and a memory concurrent access pattern of a target application, and generating a memory address mapping relation of the target application based on the configuration parameters and the memory concurrent access pattern, wherein the configuration parameters are parameters that influence memory-level parallelism, the memory concurrent access pattern is obtained through static analysis or dynamic analysis, the static analysis is directed to regular applications, and the dynamic analysis is directed to irregular applications;
receiving an execution request of a user for the target application, and calling the memory address mapping relation corresponding to the target application according to the execution request;
and dynamically configuring the memory subsystem of the chip according to the memory address mapping relation.
2. The method of claim 1, wherein obtaining the memory concurrent access pattern of the target application comprises:
acquiring code information of the target application, and obtaining the memory concurrent access pattern of the target application according to the code information.
3. The method of claim 1 or 2, wherein obtaining the memory concurrent access pattern of the target application comprises:
acquiring a bit flipping rate of the memory access stream of the target application, and obtaining the memory concurrent access pattern of the target application based on the bit flipping rate.
4. The method of claim 3, wherein obtaining the bit flipping rate of the memory access stream of the target application comprises:
configuring, in a simulator or in the chip, a device for recording the bit flipping rate during operation; and
acquiring, based on the device, the bit flipping rate of the memory access stream of the target application within a preset time period.
5. The method of claim 1, wherein the obtaining configuration parameters of the chip comprises:
acquiring architecture parameters, memory system parameters, and a scheduling policy of the chip.
6. The method of claim 1, wherein after generating the memory address mapping relation of the target application based on the configuration parameters and the memory concurrent access pattern, the method further comprises:
binding the target application and the memory address mapping relation based on a preset position.
7. The method of claim 6, wherein the preset position comprises:
a position of metadata of an executable file corresponding to the target application, or a position of a page table entry corresponding to each data block in the target application.
8. A dynamically reconfigurable memory address mapping device, comprising:
a mapping module, configured to acquire configuration parameters of a chip and a memory concurrent access pattern of a target application, and to generate a memory address mapping relation of the target application based on the configuration parameters and the memory concurrent access pattern, wherein the configuration parameters are parameters that influence memory-level parallelism, the memory concurrent access pattern is obtained through static analysis or dynamic analysis, the static analysis is directed to regular applications, and the dynamic analysis is directed to irregular applications;
a calling module, configured to receive an execution request of a user for the target application and to call the memory address mapping relation corresponding to the target application according to the execution request; and
an execution module, configured to dynamically configure the memory subsystem of the chip according to the memory address mapping relation.
9. A dynamically reconfigurable memory address mapping device, comprising: a memory, a processor, and a computer program, wherein the computer program is stored in the memory, and the processor runs the computer program to perform the method of any one of claims 1 to 7.
10. A readable storage medium, in which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of any one of claims 1 to 7.
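To illustrate the "bit flipping rate" referenced in claims 3 and 4 (a hypothetical sketch — the function name and recorder model are not from the patent): for each address bit, it can be taken as the fraction of consecutive accesses in the memory access stream in which that bit changes. A recorder running in a simulator or on the chip over a preset time window could accumulate exactly these counts; a regular unit-stride stream toggles its low-order bits at a high rate, while an irregular stream does not.

```python
# Hypothetical per-bit toggle-rate recorder for a memory access stream.
# High toggle rates concentrated in the low-order bits suggest a regular
# (streaming) access pattern; flat, low rates suggest an irregular one.

def bit_flip_rates(addresses, width=32):
    """For each address bit, the fraction of consecutive accesses where it flips."""
    flips = [0] * width
    for prev, cur in zip(addresses, addresses[1:]):
        diff = prev ^ cur                  # bits that changed between accesses
        for b in range(width):
            flips[b] += (diff >> b) & 1
    n = max(len(addresses) - 1, 1)         # number of consecutive pairs
    return [f / n for f in flips]

# A unit-stride stream: bit 0 flips on every access, bit 1 on roughly half.
stream = list(range(0, 16))
rates = bit_flip_rates(stream, width=8)
```

The resulting profile (rates decaying from bit 0 upward) is what a static or dynamic analysis could use to classify the application and select a mapping.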
CN202111155689.1A 2021-09-30 2021-09-30 Dynamic reconfigurable memory address mapping method and device Active CN113590508B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111155689.1A CN113590508B (en) 2021-09-30 2021-09-30 Dynamic reconfigurable memory address mapping method and device

Publications (2)

Publication Number Publication Date
CN113590508A (en) 2021-11-02
CN113590508B (en) 2022-02-11

Family

ID=78242554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111155689.1A Active CN113590508B (en) 2021-09-30 2021-09-30 Dynamic reconfigurable memory address mapping method and device

Country Status (1)

Country Link
CN (1) CN113590508B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114996201B (en) * 2022-07-28 2022-09-30 沐曦科技(成都)有限公司 Routing system based on Die interconnection
CN117827702A (en) * 2022-09-28 2024-04-05 深圳市中兴微电子技术有限公司 Memory access method and system, electronic device and computer readable storage medium
CN115374022B (en) * 2022-10-27 2023-02-07 北京象帝先计算技术有限公司 Memory access method, device and system and electronic equipment
CN118113500A (en) * 2022-11-30 2024-05-31 华为技术有限公司 Memory mapping method and related equipment

Citations (3)

Publication number Priority date Publication date Assignee Title
CN103164368A (en) * 2013-03-29 2013-06-19 惠州Tcl移动通信有限公司 Method and system enabling embedded device to be compatible with different address mapping internal storage chips
CN105426324A (en) * 2014-05-29 2016-03-23 展讯通信(上海)有限公司 Memory access control method and apparatus of terminal device
CN111078301A (en) * 2018-10-22 2020-04-28 致茂电子(苏州)有限公司 Multi-core arithmetic device and operation method thereof

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US10037228B2 (en) * 2012-10-25 2018-07-31 Nvidia Corporation Efficient memory virtualization in multi-threaded processing units


Also Published As

Publication number Publication date
CN113590508A (en) 2021-11-02

Similar Documents

Publication Publication Date Title
CN113590508B (en) Dynamic reconfigurable memory address mapping method and device
WO2021254135A1 (en) Task execution method and storage device
US12020065B2 (en) Hierarchical processor selection
US9772881B2 (en) Hardware resource allocation for applications
US11500828B1 (en) Method and device for constructing database model with ID-based data indexing-enabled data accessing
US11836087B2 (en) Per-process re-configurable caches
WO2024119988A1 (en) Process scheduling method and apparatus in multi-cpu environment, electronic device, and medium
US20220382672A1 (en) Paging in thin-provisioned disaggregated memory
EP4209914A1 (en) Reconfigurable cache architecture and methods for cache coherency
CN115981833A (en) Task processing method and device
CN114489475B (en) Distributed storage system and data storage method thereof
CN118312102A (en) IO request processing method and device, storage equipment and storage medium
EP4158485A1 (en) Inference in memory
US20240152278A1 (en) Apparatus and method for dynamically reconfiguring memory region of memory device
Guan et al. Crane: mitigating accelerator under-utilization caused by sparsity irregularities in cnns
US12086622B2 (en) Optimizing virtual machine scheduling on non-uniform cache access (NUCA) systems
US20190034339A1 (en) Cache utility modeling for automated cache configuration
CN116450055B (en) Method and system for distributing storage area between multi-processing cards
CN117311910B (en) High-performance virtual password machine operation method
US20240070107A1 (en) Memory device with embedded deep learning accelerator in multi-client environment
US11487582B2 (en) Information processing apparatus and computer-readable recording medium having stored therein process allocation determining program
US20230214271A1 (en) Method of scheduling cache budget in multi-core processing device and multi-core processing device performing the same
CN115562830A (en) Host bus adapter tuning method and device, electronic equipment and storage medium
CN118642857A (en) Task distribution method, device, system, equipment, medium and product
CN116954918A (en) Memory management method, memory device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant