WO2015100674A1 - Data migration method, apparatus, and processor - Google Patents


Info

Publication number
WO2015100674A1
Authority
WO
WIPO (PCT)
Prior art keywords
page
shared virtual
virtual memory
partition
memory
Prior art date
Application number
PCT/CN2013/091232
Other languages
English (en)
French (fr)
Inventor
林擎天
熊哲皓
王卓立
颜友亮
朱望斌
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN201380002713.5A (published as CN104956341A)
Priority to EP13900674.6A (published as EP3062229A4)
Priority to PCT/CN2013/091232 (published as WO2015100674A1)
Publication of WO2015100674A1
Priority to US15/197,358 (published as US20160306741A1)


Classifications

    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1009 Address translation using page tables, e.g. page table structures
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/0647 Migration mechanisms
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G06F11/3037 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a memory, e.g. virtual memory, cache
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment, for performance assessment
    • G06F2201/81 Threshold
    • G06F2212/1024 Latency reduction

Definitions

  • Embodiments of the present invention relate to storage technology, and in particular to a data migration method, apparatus, and processor.
  • Background
  • Multi-core technology enables servers to process tasks in parallel that previously might have required multiple processors. Multi-core systems are also easier to scale, delivering more processing power in a slimmer form factor that uses less power and generates less heat.
  • An addressable on-chip memory is configured on the chip for the multiple processor cores. Compared with off-chip memory, on-chip memory is fast to access but small in capacity; for example, a chip provided in the prior art is configured with 384 KB of on-chip memory. However, the inventors found that the access time of on-chip memory is affected by the on-chip network, for example the network communication between cores and the communication network between a core and the on-chip memory: the greater the distance, the longer the access time.
  • In the distributed shared memory (DSM) model, when the address that a process accesses in the shared virtual memory space does not belong to the memory maintained by that process, the data must be copied from the physical memory space maintained by another process to the local physical memory space. Because of the on-chip network, this delay is often large.
  • Summary of the invention
  • Embodiments of the present invention provide a data migration method, apparatus, and processor for efficiently managing on-chip memory and reducing delay caused by an on-chip network.
  • In a first aspect, an embodiment of the present invention provides a data migration method applied in a many-core system. The many-core system includes a processor having multiple processor cores and is configured with distributed on-chip memory. The distributed on-chip memory is divided into a plurality of on-chip memory partitions, and the processor cores are assigned to the on-chip memory partitions according to the proximity principle. The many-core system runs multiple processes belonging to the same application, and these processes have a shared virtual memory space in the on-chip memory. The method includes:
  • the first process obtains the access frequency of a first shared virtual memory page by the processor core set in each on-chip memory partition, where the first process is any one of the multiple processes belonging to the same application, the first shared virtual memory page is any shared virtual memory page in the shared virtual memory space, and the access frequency of a processor core set is the sum of the numbers of accesses by all processor cores belonging to one on-chip memory partition;
  • the first process determines whether the access frequency of the first shared virtual memory page by the processor core set in a second on-chip memory partition is higher, by more than a first preset threshold, than the access frequency by the processor core set in the first on-chip memory partition, where the physical page corresponding to the first shared virtual memory page is located; and if so, the first process moves the data in the physical page corresponding to the first shared virtual memory page to the second on-chip memory partition.
  • With reference to the first aspect, an embodiment of the present invention provides a first possible implementation manner, in which the many-core system sets a corresponding page directory table in the on-chip memory partition for the multiple processes belonging to the same application. The page directory table records the correspondence between physical pages in the on-chip memory and shared virtual memory pages belonging to the shared virtual memory space, together with the access frequency of each shared virtual memory page by the processor core set in each on-chip memory partition.
  • The first process obtaining the access frequency of the first shared virtual memory page by the processor core set in each on-chip memory partition includes: the first process obtains this access frequency by querying the page directory table.
  • After the first process moves the data in the physical page corresponding to the first shared virtual memory page to the second on-chip memory partition, the method further includes: the first process updates the physical page corresponding to the first shared virtual memory page to the physical page in the second on-chip memory partition that stores the moved data.
  • An embodiment of the present invention provides a second possible implementation manner of the first aspect, in which the many-core system further stores a page directory history table corresponding to the page directory table. The page directory history table stores the shared virtual memory pages removed from the page directory table and the historical access frequency of those pages. The method further includes: the first process accesses a second shared virtual memory page in the shared virtual memory space; when a page fault occurs in the many-core system, the first process searches the page directory table for a physical page corresponding to the second shared virtual memory page; if no such physical page is found in the page directory table, the first process searches the page directory history table for the second shared virtual memory page; and when the second shared virtual memory page is found in the page directory history table, it is moved from the page directory history table to the page directory table.
  • An embodiment of the present invention further provides a third possible implementation manner of the first aspect, in which, after the second shared virtual memory page is found in the page directory history table, the method further includes: the first process obtains, from the page directory history table, the historical access frequency of the second shared virtual memory page by the processor core set in each on-chip memory partition; determines, partition by partition in order of that historical access frequency, whether the on-chip memory partition has a target physical page that satisfies a preset rule, until the target physical page is obtained; moves the data corresponding to the second shared virtual memory page to the target physical page; and adds the target physical page corresponding to the second shared virtual memory page to the page directory table.
  • An embodiment of the present invention further provides a fourth possible implementation manner of the first aspect, in which the multiple processes belonging to the same application include a second process that maintains the correspondence between virtual pages of the shared virtual memory space and physical pages. After the second shared virtual memory page is found in the page directory history table, the method further includes:
  • the first process sends a request to the second process. The request asks the second process to determine, partition by partition in order of the historical access frequency of the processor core sets, whether the on-chip memory partition has a target physical page that satisfies a preset rule, until the target physical page is obtained; to move the data corresponding to the second shared virtual memory page to the target physical page; and to add the target physical page corresponding to the second shared virtual memory page to the page directory table.
  • An embodiment of the present invention provides a fifth possible implementation manner of the first aspect, in which, when the second shared virtual memory page is not found in the page directory history table, the first process determines whether the on-chip memory partition where the processor core running the first process is located has a target physical page that satisfies a preset rule;
  • if so, the first process moves the data corresponding to the second shared virtual memory page to the target physical page.
  • An embodiment of the present invention further provides a sixth possible implementation manner of the first aspect, in which each physical page in an on-chip memory partition has a slot number within the partition. The first process obtains an index value according to the address of the second shared virtual memory page, and the physical page that satisfies the preset rule is a free physical page whose slot number matches the index value.
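  • The slot-matching rule above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how the index value is derived from the virtual page address, so the modulo mapping, the 4 KB page size used here, and the names `slot_index` and `find_target_page` are all hypothetical.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages


def slot_index(virtual_address, slots_per_partition):
    """Derive an index value from a virtual address.

    Assumption: the index is the virtual page number modulo the number
    of slots in a partition. The source only says the index is obtained
    "according to" the shared virtual page address.
    """
    return (virtual_address // PAGE_SIZE) % slots_per_partition


def find_target_page(partition_slots, virtual_address):
    """Return the matching slot number if that slot is free, else None.

    partition_slots[i] is None when slot i is free; otherwise it holds
    the virtual page currently stored in that physical page.
    """
    idx = slot_index(virtual_address, len(partition_slots))
    return idx if partition_slots[idx] is None else None


# Hypothetical partition with four slots; slot 1 is occupied.
slots = [None, 7, None, None]
free_hit = find_target_page(slots, 0x2000)   # virtual page 2 maps to slot 2
occupied = find_target_page(slots, 0x5000)   # virtual page 5 maps to slot 1
```

  • Under this mapping a page can only land in one slot per partition, which is why a lookup may have to move on to another partition when the matching slot is taken.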
  • An embodiment of the present invention provides a seventh possible implementation manner of the first aspect, in which, when in all on-chip memory partitions in the many-core system the physical page whose slot number matches the index value is not a free physical page, the method further includes:
  • An embodiment of the present invention further provides an eighth possible implementation manner, in which the page directory table includes multiple entries, and one entry records one physical page in the on-chip memory. The slot number of the physical page in each entry of the page directory table is the index value of that entry. The first process searching the page directory table for the physical page corresponding to the second shared virtual memory page includes:
  • An embodiment of the present invention further provides a ninth possible implementation manner of the first aspect, in which the page directory history table further records, for each shared virtual memory page, the sum of the frequencies with which the page was accessed by the processor core sets of the on-chip memory partitions during the time it was in the page directory table. The method further includes:
  • the first process orders the frequency sums of the shared virtual memory pages in the page directory history table from low to high and discards the information of a preset number of shared virtual memory pages with the lowest frequency sums. The shared virtual memory page information includes the shared virtual memory page, the historical access frequency of the page by the processor core set of each on-chip memory partition, and the sum of those historical access frequencies.
  • In a second aspect, an embodiment of the present invention further provides a data migration apparatus disposed in a many-core system. The many-core system includes a processor having multiple processor cores and is configured with distributed on-chip memory. The distributed on-chip memory is divided into a plurality of on-chip memory partitions, and the processor cores are assigned to the on-chip memory partitions according to the proximity principle. The many-core system runs multiple processes belonging to the same application, and these processes have a shared virtual memory space in the on-chip memory. The data migration apparatus is integrated in the processor and runs a first process, which is any one of the multiple processes belonging to the same application. The data migration apparatus includes:
  • an access frequency obtaining unit, configured to obtain the access frequency of a first shared virtual memory page by the processor core set in each on-chip memory partition, where the first shared virtual memory page is any shared virtual memory page in the shared virtual memory space, and the access frequency of a processor core set is the sum of the numbers of accesses by all processor cores belonging to one on-chip memory partition;
  • a migration judging unit, configured to determine whether the access frequency of the first shared virtual memory page by the processor core set in a second on-chip memory partition is higher, by more than a first preset threshold, than the access frequency by the processor core set in the first on-chip memory partition, where the physical page corresponding to the first shared virtual memory page is located;
  • a data migration unit, configured to move the data in the physical page corresponding to the first shared virtual memory page to the second on-chip memory partition when the determination result of the migration judging unit is yes.
  • An embodiment of the present invention provides a first possible implementation manner of the second aspect, in which the many-core system sets a corresponding page directory table in the on-chip memory partition for the multiple processes belonging to one application. The page directory table records the correspondence between physical pages in the on-chip memory and shared virtual memory pages belonging to the shared virtual memory space, together with the access frequency of each shared virtual memory page by the processor core set in each on-chip memory partition;
  • the access frequency obtaining unit is configured to obtain, by querying the page directory table, the access frequency of the first shared virtual memory page by the processor core set in each on-chip memory partition;
  • the apparatus further includes a page directory updating unit, configured to update, after the data in the physical page corresponding to the first shared virtual memory page is moved to the second on-chip memory partition, the physical page corresponding to the first shared virtual memory page to the physical page in the second on-chip memory partition that stores the moved data.
  • An embodiment of the present invention further provides a second possible implementation manner of the second aspect, in which the many-core system further stores a page directory history table corresponding to the page directory table, used to store the shared virtual memory pages removed from the page directory table and the historical access frequency of those pages. The data migration apparatus further includes:
  • an access unit, configured to access a second shared virtual memory page in the shared virtual memory space; and a searching unit, configured to search the page directory table for a physical page corresponding to the second shared virtual memory page when a page fault occurs in the many-core system and, if no such physical page is found in the page directory table, to search the page directory history table for the second shared virtual memory page;
  • the page directory updating unit is further configured to move the second shared virtual memory page from the page directory history table to the page directory table when the second shared virtual memory page is found in the page directory history table.
  • An embodiment of the present invention further provides a third possible implementation manner of the second aspect, in which the data migration apparatus further includes:
  • a historical access frequency obtaining unit, configured to obtain, from the page directory history table, the historical access frequency of the second shared virtual memory page by the processor core sets after the searching unit finds the second shared virtual memory page in the page directory history table;
  • a page selection unit, configured to determine, partition by partition in order of the historical access frequency of the second shared virtual memory page obtained by the historical access frequency obtaining unit, whether the on-chip memory partition has a target physical page that satisfies a preset rule, until the target physical page is obtained;
  • the data migration unit is further configured to move the data corresponding to the second shared virtual memory page to the target physical page;
  • the page directory updating unit is further configured to add the target physical page corresponding to the second shared virtual memory page to the page directory table.
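  • The page-fault path described above (page directory table first, then the page directory history table, then a partition-by-partition search) might look like the sketch below. It assumes that partitions are tried from the highest historical access frequency to the lowest, which the text implies but does not state, and it uses a hypothetical callback `has_target_page` as a stand-in for the preset rule.

```python
def handle_page_fault(vpage, page_dir, history, has_target_page):
    """Resolve a fault on vpage and report where the mapping came from.

    page_dir: {virtual page: partition holding its physical page}
    history:  {virtual page: {partition: historical access frequency}}
    has_target_page: callable(partition) -> bool, a hypothetical
        stand-in for "a target physical page satisfying the preset rule".
    """
    if vpage in page_dir:
        return ("page_directory", page_dir[vpage])
    if vpage in history:
        freqs = history[vpage]
        # Assumption: try the most frequently accessing partition first.
        for partition in sorted(freqs, key=freqs.get, reverse=True):
            if has_target_page(partition):
                # Move the page from the history table to the page
                # directory table.
                del history[vpage]
                page_dir[vpage] = partition
                return ("history", partition)
        return ("history", None)
    return ("miss", None)
```

  • A caller that gets `("miss", None)` would fall back to the case in which the page is in neither table, which the text handles by looking in the partition of the faulting core.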
  • An embodiment of the present invention further provides a fourth possible implementation manner of the second aspect, in which the multiple processes belonging to the same application include a second process that maintains the correspondence between virtual pages of the shared virtual memory space and physical pages. The apparatus further includes:
  • an instruction unit, configured to send a request to the second process after the searching unit finds the second shared virtual memory page in the page directory history table. The request asks the second process to determine, partition by partition in order of the historical access frequency of the processor core sets, whether the on-chip memory partition has a target physical page that satisfies a preset rule, until the target physical page is obtained; to move the data corresponding to the second shared virtual memory page to the target physical page; and to add the target physical page corresponding to the second shared virtual memory page to the page directory table.
  • An embodiment of the present invention further provides a fifth possible implementation manner of the second aspect, in which the apparatus further includes:
  • a page selection unit, configured to determine, when the searching unit does not find the second shared virtual memory page in the page directory history table, whether the on-chip memory partition where the data migration apparatus is located has a target physical page that satisfies a preset rule and, if so, to notify the data migration unit to move the data corresponding to the second shared virtual memory page to the target physical page;
  • the page directory updating unit is further configured to put the correspondence between the second shared virtual memory page and the target physical page into the page directory table, and to record in the page directory table the access frequency of the second shared virtual memory page by the processor core set in each on-chip memory partition.
  • An embodiment of the present invention provides a sixth possible implementation manner of the second aspect, in which each physical page in an on-chip memory partition has a slot number within the partition. The data migration apparatus obtains an index value according to the address of the second shared virtual memory page, and the physical page that satisfies the preset rule is a free physical page whose slot number matches the index value.
  • An embodiment of the present invention further provides a seventh possible implementation manner of the second aspect, in which the page directory history table further records, for each shared virtual memory page, the sum of the frequencies with which the page was accessed by the processor core sets of the on-chip memory partitions during the time it was in the page directory table. The apparatus further includes:
  • an entry discarding unit, configured to, when the remaining storage space of the page directory history table is less than a second preset threshold, order the frequency sums of the shared virtual memory pages in the page directory history table from low to high and discard the information of a preset number of shared virtual memory pages with the lowest frequency sums. The shared virtual memory page information includes the shared virtual memory page, the historical access frequency of the page by the processor core set in each on-chip memory partition, and the sum of those historical access frequencies.
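  • As a rough illustration of the discarding rule above, the sketch below evicts the entries with the lowest frequency sums once free space in the history table runs low. Measuring storage space as a count of entries is an assumption made for simplicity, and `evict_history`, `capacity`, `min_free`, and `batch` are hypothetical names for the table size, the second preset threshold, and the preset number of entries to discard.

```python
def evict_history(history, capacity, min_free, batch):
    """Discard the `batch` entries with the lowest frequency sums.

    history:  {virtual page: {partition: historical access frequency}}
    capacity: total number of entries the table can hold (assumed unit)
    min_free: second preset threshold, in entries
    batch:    preset number of entries to discard
    Returns the list of discarded virtual pages (empty if no eviction).
    """
    if capacity - len(history) >= min_free:
        return []  # enough space left, nothing to discard
    # Order pages by the sum of their per-partition frequencies, low to high.
    by_sum = sorted(history, key=lambda v: sum(history[v].values()))
    victims = by_sum[:batch]
    for vpage in victims:
        del history[vpage]
    return victims


# Hypothetical table with four entries and room for five.
history = {1: {0: 5, 1: 5}, 2: {0: 1}, 3: {0: 2, 1: 1}, 4: {0: 9}}
discarded = evict_history(history, capacity=5, min_free=2, batch=2)
```

  • Here only one slot is free, which is below the threshold of two, so the two pages with the smallest sums (pages 2 and 3) are dropped.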
  • In a third aspect, an embodiment of the present invention provides a processor used in a many-core system. The many-core system includes the processor; the processor includes multiple processor cores and is configured with distributed on-chip memory. The processor runs multiple processes belonging to the same application, and these processes have a shared virtual memory space in the on-chip memory. The distributed on-chip memory is divided into a plurality of on-chip memory partitions, and the processor cores are assigned to the on-chip memory partitions according to the proximity principle. The processor includes a processor core, a memory, a communication interface, and a bus, where the processor core runs a first process, and the first process is any one of the multiple processes belonging to the same application;
  • the processor core, the communication interface, and the memory communicate with each other through the bus; the communication interface is configured to receive and transmit data; and the memory is configured to store a program;
  • the processor core is configured to execute the program in the memory to perform the method provided by the first aspect or any possible implementation manner of the first aspect.
  • In the embodiments of the present invention, according to the access frequency with which a virtual memory page in the virtual memory space shared among the multiple processes belonging to one application is accessed by the processor core set in each on-chip memory partition, the data corresponding to the virtual memory page is moved to the on-chip memory partition where the processor core set with the high access frequency is located. This reduces the delay caused by cross-partition access when the virtual memory page is accessed.
  • FIG. 1 is a schematic diagram of an on-chip memory partition structure in a many-core system according to an embodiment of the present invention;
  • FIG. 2 is a flowchart of a data migration method according to an embodiment of the present invention;
  • FIG. 3 is a schematic diagram of a page placement process according to an embodiment of the present invention;
  • FIG. 4 is a schematic diagram of searching for a target physical page according to an index value according to an embodiment of the present invention;
  • FIG. 5 is a schematic structural diagram of a data migration apparatus according to an embodiment of the present invention;
  • FIG. 6 is a schematic structural diagram of a processor according to an embodiment of the present invention.
  • Detailed description
  • The embodiment of the present invention is applied to a many-core system. The many-core system includes a processor having a plurality of processor cores, and multiple processes belonging to different applications run on the plurality of processor cores;
  • Distributed on-chip memory is configured in the many-core system and is divided into a plurality of on-chip memory partitions. The processor cores are assigned to the on-chip memory partitions according to the proximity principle, and all of the processor cores included in one on-chip memory partition are called a processor core set. For example, FIG. 1 shows the on-chip memory partitions of a many-core system whose distributed on-chip memory is divided into four partitions, where the page size is assumed to be 4 KB.
  • The data migration method includes: Step 201: The first process obtains the access frequency of the first shared virtual memory page by the processor core set in each on-chip memory partition;
  • the first process is any one of the multiple processes belonging to the same application;
  • the first shared virtual memory page is any shared virtual memory page in the shared virtual memory space;
  • a processor core set consists of all processor cores in one on-chip memory partition.
  • The access frequency of a processor core set to a shared virtual page is the sum of the numbers of accesses to that page by all processor cores belonging to one on-chip memory partition. For example, the sum of the numbers of accesses to the first shared virtual memory page by all processor cores of an on-chip memory partition is the access frequency of the processor core set of that partition;
  • the distributed on-chip memory configured in the many-core system is partitioned, and each processor core is assigned to the nearest on-chip memory partition according to the proximity principle. This does not exclude the case in which one processor core belongs to two adjacent on-chip memory partitions, or even the case in which an on-chip memory partition contains no processor core;
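  • The definition above, where a processor core set's access frequency is the sum over its cores, can be illustrated with a short sketch. The function name and data layout are hypothetical, and counting a core that belongs to two adjacent partitions toward both of them is an assumption; the source does not say how such a core is counted.

```python
from collections import defaultdict


def core_set_access_frequency(core_access_counts, core_to_partitions):
    """Sum per-core access counts for one page into per-partition totals.

    core_access_counts: {core id: number of accesses to the page}
    core_to_partitions: {core id: list of partitions the core belongs to}
    """
    totals = defaultdict(int)
    for core, count in core_access_counts.items():
        # Assumption: a core shared by two partitions counts toward both.
        for partition in core_to_partitions.get(core, ()):
            totals[partition] += count
    return dict(totals)


# Hypothetical layout: cores 0 and 1 in partition 0, core 2 in partition 1,
# core 3 shared by the adjacent partitions 1 and 2.
core_to_partitions = {0: [0], 1: [0], 2: [1], 3: [1, 2]}
counts = {0: 5, 1: 7, 2: 1, 3: 4}
freq = core_set_access_frequency(counts, core_to_partitions)
```

  • With these numbers partition 0 sees 12 accesses, partition 1 sees 5, and partition 2 sees 4.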
  • The trigger for a process to dynamically adjust the data in the on-chip memory may be, for example, that all processes belonging to the same application reach a global synchronization point such as a barrier. The trigger can be set according to actual conditions, and the embodiment of the present invention does not limit this.
  • Step 202: The first process determines whether the access frequency F2 of the first shared virtual memory page by the processor core set in the second on-chip memory partition is higher, by more than the first preset threshold, than the access frequency F1 by the processor core set in the first on-chip memory partition, where the physical page corresponding to the first shared virtual memory page is located; if yes, the process proceeds to step 203; if not, the process ends;
  • the first preset threshold is set according to the actual situation, and the embodiment of the present invention does not limit it.
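  • A minimal sketch of the step 202 check follows. It reads the condition as "F2 exceeds F1 by more than the first preset threshold", which is an interpretation of the translated text, and it picks the partition with the highest access frequency as the candidate second partition, which the description offers as one option. The function name is hypothetical.

```python
def pick_migration_target(freq_by_partition, current_partition, threshold):
    """Return the partition to migrate the page to, or None to leave it.

    freq_by_partition: {partition: access frequency of its core set}
    current_partition: partition holding the page's physical page (F1 side)
    threshold:         first preset threshold
    """
    f1 = freq_by_partition.get(current_partition, 0)
    candidates = {p: f for p, f in freq_by_partition.items()
                  if p != current_partition}
    if not candidates:
        return None
    # Candidate second partition: the one whose core set accesses most.
    target = max(candidates, key=candidates.get)
    f2 = candidates[target]
    # Interpretation: migrate only if F2 exceeds F1 by more than the threshold.
    return target if f2 - f1 > threshold else None
```

  • For example, with frequencies `{0: 3, 1: 20, 2: 9}`, the page in partition 0, and a threshold of 10, the function selects partition 1; with `{0: 3, 1: 12}` the gap of 9 is too small and the page stays where it is.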
  • Step 203 The first process moves data of the physical page in the on-chip memory corresponding to the first shared virtual memory page to the second on-chip memory partition.
  • The first process may move the data to the second on-chip memory partition by sending a data-move instruction to the CPU;
  • For ease of description, the embodiment of the present invention refers to the on-chip memory partition where the physical page corresponding to the first shared virtual memory page is located as the first on-chip memory partition, and to any on-chip memory partition in the many-core system other than the first on-chip memory partition as a second on-chip memory partition;
  • The first process may request the operating system to establish, in a process page table managed by the operating system, a correspondence between the first shared virtual memory page and the target physical page; the first process then moves the data to the target physical page in the second on-chip memory partition;
  • the on-chip memory partition where the processor core set with the highest frequency of access to the first shared virtual memory page is located may be selected as the second on-chip memory partition.
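The decision in steps 202 and 203 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the dictionary layout, and the choice of the most frequent other partition as the candidate are assumptions made for the example.

```python
from typing import Dict, Optional


def core_set_frequency(per_core_counts: Dict[int, int]) -> int:
    """Access frequency of one partition's processor core set:
    the sum of the access counts of all cores in that partition."""
    return sum(per_core_counts.values())


def choose_migration_target(freq_by_partition: Dict[int, int],
                            first_partition: int,
                            threshold: int) -> Optional[int]:
    """Return the partition the page should move to, or None to stay.

    freq_by_partition maps a partition id (hypothetical) to the access
    frequency of its core set for one shared virtual memory page.
    """
    f1 = freq_by_partition.get(first_partition, 0)
    others = {p: f for p, f in freq_by_partition.items()
              if p != first_partition}
    if not others:
        return None
    # Candidate second partition: the one whose core set accesses
    # the page most often.
    second = max(others, key=others.get)
    f2 = others[second]
    # Step 202: migrate only if F2 exceeds F1 by more than the threshold.
    return second if f2 - f1 > threshold else None
```

With frequencies `{0: 5, 1: 40, 2: 12}`, a page resident in partition 0 and a threshold of 10, the sketch selects partition 1; with a threshold of 100 the page stays where it is.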
  • In this embodiment, according to the frequency with which a virtual memory page in the virtual memory space shared among the multiple processes of one application is accessed by the processor core set in each on-chip partition, the data of the on-chip physical memory page corresponding to the virtual memory page is moved to the on-chip memory partition where the processor core set with the high access frequency is located. This reduces the latency caused by cross-partition access when the virtual memory page is accessed subsequently.
  • A corresponding page directory table is set in the on-chip memory partition for the multiple processes belonging to the same application, and the multiple processes of one application share one shared virtual memory space.
  • The page directory table records the correspondence between physical pages in the on-chip memory and shared virtual memory pages belonging to the shared virtual memory space.
  • The page directory table may also record the access frequency of each shared virtual memory page in the table by the processor core set in each on-chip memory partition. It should be noted that each shared virtual memory page in the page directory table corresponding to an application corresponds to a physical page in the on-chip memory; therefore, the access frequency of a shared virtual memory page by a processor core set is equal to the access frequency, by that processor core set, of the physical page corresponding to the shared virtual memory page;
  • Table 1 below is an example of the page directory table:
  • Virtual memory page address: by default a page is usually 4 KB, so the lowest three hexadecimal digits of the virtual memory page address are 0;
  • Physical page address: the address, in the on-chip memory of the partition, of the physical memory page corresponding to the shared virtual memory page identified by the virtual memory page address; the physical page address may be the actual physical address of the physical memory page, or an offset from the starting physical address of the partition's on-chip memory;
  • Frequency of access (frequencies): records the frequency of access by the processor cores in each partition to the shared virtual memory page. Since the on-chip memory is globally accessible, the processor cores in one partition can access the on-chip memory in other partitions. Summing the access frequencies of all processor cores in a partition to the shared virtual memory page gives the access frequency of that partition's processor core set to the page, so in the frequencies field each partition has only one access frequency value for the shared virtual memory page.
  • The frequencies field has one subfield for each partition;
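The three fields described above can be pictured as one record per entry. The following is a minimal sketch; the class name, the attribute names, and the alignment check are illustrative assumptions, since the source only names the fields.

```python
from dataclasses import dataclass, field
from typing import Dict

PAGE_SIZE = 4096  # 4 KB page, as in the field description


@dataclass
class PageDirectoryEntry:
    """One page directory entry (names are hypothetical)."""
    virtual_page_addr: int    # shared virtual memory page address
    physical_page_addr: int   # absolute address or offset within the partition
    # One subfield per partition: partition id -> core set access frequency.
    frequencies: Dict[int, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # A 4 KB aligned address has its lowest three hex digits equal to 0.
        if self.virtual_page_addr % PAGE_SIZE != 0:
            raise ValueError("virtual page address must be page-aligned")

    def record_access(self, partition: int, count: int = 1) -> None:
        """Add accesses made by cores belonging to the given partition."""
        self.frequencies[partition] = self.frequencies.get(partition, 0) + count
```

Accesses from any core in a partition are accumulated into that partition's single subfield, so the entry never stores per-core values.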
  • The access frequency of a virtual memory page by a processor core set may be calculated with an aging mechanism: the access frequency is a weighted sum of the process's access frequencies to the shared virtual memory page in each stage, with a relatively large weight given to newer stages and a relatively small weight given to older stages, so that the resulting access frequency better captures the process's recent access characteristics for the shared virtual memory page.
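One way to realize such an aging mechanism is an exponentially decayed sum. The source does not specify the weights, so the geometric decay factor below is an assumption chosen only to illustrate "newer stages weigh more".

```python
from typing import Sequence


def aged_frequency(stage_counts: Sequence[float], decay: float = 0.5) -> float:
    """Weighted sum of per-stage access counts, oldest stage first.

    The newest stage has weight 1; each older stage is multiplied by
    `decay` once more (decay is a hypothetical parameter, 0 < decay < 1).
    """
    total = 0.0
    for count in stage_counts:
        # Age everything seen so far, then add the newer stage at full weight.
        total = total * decay + count
    return total
```

For the stage counts `[8, 4, 2]` with a decay of 0.5 the weights are 0.25, 0.5 and 1, so each stage contributes 2 and the result is 6.0. The same ten accesses count far more when they occur in the latest stage than in the oldest one.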
  • The first process may obtain the access frequency of the first shared virtual memory page by the processor core set in each on-chip memory partition by querying the page directory table.
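  • The method may further include: Step 204: The first process updates, in the page directory table, the physical page corresponding to the first shared virtual memory page to the physical page in the second on-chip memory partition that stores the moved data.
  • Because the on-chip memory space is limited, the data of shared virtual pages with a high access frequency is stored in physical pages in the on-chip memory as far as possible.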
  • A page directory history table records each shared virtual memory page removed from the page directory table together with the access frequency of the processor core set in each on-chip memory partition during the period in which the page was in the on-chip memory, that is, the historical access frequency of the processor core set in each on-chip memory partition. Similar to the page directory table, one page directory history table is set per application to maintain the shared virtual memory pages deleted from the page directory table corresponding to that application. For example, a virtual memory space is shared among the multiple processes belonging to one application and the page directory table is established for that shared virtual memory space, so the page directory history table is likewise dedicated to maintaining the shared virtual memory pages removed from that page directory table and their historical access frequencies by the processor core sets.
  • Table 2 below is an example of the page directory history table:
  • Virtual memory page address: the shared virtual memory page address; it has the same meaning as the corresponding field in the page directory entry described above;
  • Historical access frequency: records the historical access frequency of the processor core set in each partition to the shared virtual memory page;
  • Sum of historical access frequencies: the sum of the historical access frequency values of the processor core sets of all partitions in the historical access frequency field. This field reflects the relative importance of each shared virtual page in the page directory history table: pages with a high total historical access frequency are more important than pages with a low one. If the page directory history table is full, the entries of pages with a low total historical access frequency are discarded first, as described later.
  • When a second shared virtual memory page in the shared virtual memory space is accessed and is not found in the page directory table, the physical memory page corresponding to the second shared virtual memory page is currently not in the on-chip memory, and the data corresponding to the second shared virtual memory page needs to be placed into the on-chip memory. This page placement process can be implemented by the following steps:
  • Step 305 The first process accesses a second shared virtual memory page in the shared virtual memory space.
  • Step 306: When a page fault occurs in the many-core system, the first process searches the page directory table for a physical page corresponding to the second shared virtual memory page; if none is found, proceed to step 307;
  • The operating system maintains, for a running application process, a process page table holding the mapping between virtual addresses and physical addresses. When the virtual page accessed by the process is not in the process page table, or when the access is a write request and the physical page corresponding to the virtual page in the process page table accepts only read requests, a page fault occurs in the many-core system. When the page fault occurs, the first process searches the page directory table stored in the on-chip memory for a physical page corresponding to the virtual page;
  • Step 307: Query whether the second shared virtual memory page exists in the page directory history table; if yes, proceed to step 308;
  • Step 308: According to the historical access frequency, recorded in the page directory history table, of the processor core set of each on-chip memory partition during the period in which the shared virtual page was in the page directory table, determine whether there is a target physical page that meets the preset rule in the on-chip memory partition with the highest historical access frequency for the second shared virtual memory page; if not, proceed to step 309b; if yes, proceed to step 309a;
  • Step 309a: The first process moves the data corresponding to the second shared virtual memory page to the target physical page that meets the preset rule in the on-chip memory partition with the highest historical access frequency, and proceeds to step 310;
  • Step 309b: In descending order of the historical access frequency of the processor core set of each on-chip memory partition in the page directory history table, determine in turn whether there is a target physical page that meets the preset rule in the on-chip memory partition where the processor core set is located, until the target physical page is obtained; move the data corresponding to the second shared virtual memory page to the target physical page; proceed to step 310;
  • Step 310 Move the second shared virtual memory page in the page directory history table to the page directory table.
  • the method further includes:
  • Step 311 Update a correspondence between the second shared virtual memory page and the physical page in the page directory table.
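Steps 308 to 309b amount to trying partitions in descending order of historical access frequency until one offers a suitable page. The sketch below assumes a caller-supplied `find_target_page` callback (hypothetical) that applies the preset rule to one partition and returns a physical page id or `None`.

```python
from typing import Callable, Dict, Optional, Tuple


def place_from_history(
    history_freq: Dict[int, int],
    find_target_page: Callable[[int], Optional[int]],
) -> Optional[Tuple[int, int]]:
    """Pick (partition, physical page) for a page found in the history table.

    history_freq maps a partition id to the historical access frequency of
    its processor core set for the page being placed.
    """
    # Highest historical frequency first (step 308), then the rest in
    # descending order (step 309b).
    for partition in sorted(history_freq, key=history_freq.get, reverse=True):
        page = find_target_page(partition)
        if page is not None:
            return partition, page
    return None  # no partition currently offers a page meeting the rule
```

If the partition with the highest historical frequency has no page that meets the rule, the page lands in the next most frequent partition rather than in an arbitrary one.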
  • Because the aging mechanism is used for calculating the access frequency, the historical access frequency in the page directory history table has little significance for the current value; it may therefore be moved to the page directory table or, after being removed from the history table, not be recorded in the page directory table;
  • The update may be performed by the first process;
  • Alternatively, the update may be performed by a second process: the multiple processes belonging to the application include a second process that maintains the correspondence between virtual pages and physical pages of the shared virtual memory space, and the first process requests this second process to perform the update:
  • In descending order of frequency, it is determined in turn whether there is a target physical page that meets the preset rule in the on-chip memory partition where each processor core set is located, until the target physical page is obtained, and the data corresponding to the second shared virtual memory page is moved to the target physical page;
  • A correspondence between the second shared virtual memory page and the target physical page is added to the page directory table.
  • The page directory history table may maintain only the shared virtual memory pages removed from the page directory table and the access frequency of each such page by the processor core set in each on-chip partition; it need not maintain the correspondence between shared virtual memory pages and physical pages. Data removed from a physical page of the on-chip memory is stored in a new physical page in off-chip memory, and a dedicated process maintains the correspondence between the shared virtual memory page and that physical page. For example, under the distributed shared memory (DSM) model, it is maintained by the home process.
  • If no shared virtual memory page identical to the second shared virtual memory page is found in the page directory history table, a target physical page may be sought directly in the on-chip memory partition of the processor core running the first process. That is, if the second shared virtual memory page is not found in the page directory history table, the method may further include: Step 312: Determine whether there is a target physical page in the on-chip memory partition where the processor core running the first process is located; if yes, proceed to step 316; if no, proceed to step 313;
  • The first process may request the data corresponding to the second shared virtual memory page from the process that maintains the correspondence between virtual pages and physical pages of the shared virtual memory space. For convenience of description, the process that maintains the correspondence between virtual addresses and physical addresses of the shared virtual memory space is referred to as the second process; the second process and the first process belong to the same application;
  • Step 313: The first process determines, from near to far, whether there is a target physical page that satisfies the preset rule in the on-chip memory partitions adjacent to the on-chip partition of the processor core running the first process, until a target physical page that meets the preset rule is found, and then proceeds to step 316; if no on-chip memory partition has a physical page that meets the preset rule, proceed to step 314;
  • The preset rule for determining the target physical page may be set by the user according to actual needs. For example, if the preset rule is that the physical page is idle, any free physical page in an on-chip memory partition is a physical page that satisfies the preset rule;
  • The embodiment of the present invention provides a rule for determining a target physical page, which is intended to improve the search.
  • Each physical page in an on-chip memory partition has a slot number in the partition; the slot number of a physical page may be unique within the partition or may be repeated.
  • For example, each partition in FIG. 4 has two identical slot numbers; the black portion is the page directory table stored in the on-chip memory partition, and the white horizontally framed portion is the page directory history table.
  • An index value is calculated from the shared virtual memory address to be searched for; the calculation may use hashing or another algorithm, which is not limited in the embodiment of the present invention. A slot number that matches the index value is then searched for, and it is determined whether the physical page at the matching slot number is an idle physical page; if yes, the physical page meets the preset rule;
  • The specific matching relationship is not limited in the embodiment of the present invention.
  • For example, the match may succeed when the index value and the slot number are equal; alternatively, each slot number may correspond to an index value range, and the match is considered successful if the index value obtained for the shared virtual memory page being searched for falls within the index value range corresponding to a slot number.
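The slot-based rule can be sketched as follows. The source leaves the hash and the matching relationship open; the page-number-modulo hash and the equality match used here are assumptions picked for brevity, as are the function names and the list layout.

```python
from typing import List, Optional, Tuple

PAGE_SIZE = 4096


def index_value(virtual_page_addr: int, num_slots: int) -> int:
    """Derive an index value from a shared virtual page address.

    Assumption: a simple modulo over the page number stands in for the
    unspecified hash.
    """
    return (virtual_page_addr // PAGE_SIZE) % num_slots


def find_free_page(slots: List[Tuple[int, bool]], index: int) -> Optional[int]:
    """Return the position of a free physical page whose slot number
    equals the index value, or None if the partition has no such page.

    slots lists (slot_number, is_free) for each physical page of one
    partition; slot numbers may repeat within the partition.
    """
    for position, (slot_number, is_free) in enumerate(slots):
        if slot_number == index and is_free:
            return position
    return None
```

Because only pages whose slot number matches the index value are examined, the search touches a small fraction of each partition instead of every physical page.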
  • Accordingly, determining the target physical page that satisfies the preset rule in the foregoing steps includes:
  • The first process obtains an index value according to the second shared virtual memory page; a physical page that meets the preset rule is a physical page that has a slot number matching the index value and is idle.
  • When the target physical page is determined using the index value and the slot number as provided by the embodiment of the present invention, and in step 313 none of the physical pages in any on-chip memory partition of the many-core system that have the slot number matching the index value is a free physical page, proceed to step 314: among the physical pages of the on-chip memory that have the slot number matching the index value, select the physical page whose corresponding shared virtual memory page has the lowest access frequency as the target physical page; proceed to step 315:
  • If no physical page that satisfies the preset rule is found in the on-chip memory, that is, none of the physical pages at the slot number matching the index value of the second shared virtual memory page is idle, a physical page must be evicted from the on-chip memory and replaced with the data corresponding to the second shared virtual memory page. Before the replacement, the original data in the target physical page must be removed, and the shared virtual memory page that originally corresponds to it in the page directory table, together with the frequencies at which that page was accessed by the processor core set in each on-chip memory partition, must be moved to the page directory history table. Therefore, before step 316, the method may further include:
  • Step 315: Move the shared virtual memory page that originally corresponds to the physical address of the target physical page in the page directory table, together with the frequencies at which it was accessed by the processor core set in each on-chip memory partition, to the page directory history table, and remove the original data from the target physical page;
  • Step 316: The first process moves the data corresponding to the second shared virtual memory page to the target physical page.
  • Step 317: The first process records, in the page directory table, the correspondence between the second shared virtual memory page and the physical page, and records the frequency of access to the second shared virtual memory page by the processor core set in each on-chip memory partition.
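The bookkeeping of steps 314 to 317 can be sketched as below. It assumes the victim is the candidate whose shared virtual page has the lowest access frequency, as stated for the page selection unit 508, and that "access frequency" means the total over all partitions; the dictionary layouts and names are hypothetical, and the actual data copy of step 316 is left out.

```python
from typing import Dict, List, Tuple

Freqs = Dict[int, int]  # partition id -> core set access frequency


def evict_and_place(page_directory: Dict[int, Tuple[int, Freqs]],
                    history: Dict[int, Freqs],
                    candidates: List[int],
                    new_vpage: int) -> int:
    """Replace one occupied physical page and return its id.

    page_directory maps physical page id -> (virtual page, frequencies).
    candidates are the occupied physical pages whose slot number matches
    the index value of new_vpage.
    """
    # Step 314: choose the page whose virtual page is accessed least.
    victim = min(candidates,
                 key=lambda p: sum(page_directory[p][1].values()))
    # Step 315: move the old mapping and its frequencies to the history table.
    old_vpage, old_freqs = page_directory.pop(victim)
    history[old_vpage] = old_freqs
    # Step 316 (copying the data into the physical page) is omitted here.
    # Step 317: record the new mapping with fresh frequency counters.
    page_directory[victim] = (new_vpage, {})
    return victim
```

Keeping the evicted page's frequencies in the history table is what later allows a returning page to be placed near the cores that used it most.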
  • In this way, virtual pages in the shared memory space are placed in the on-chip memory according to the access frequency of the processor core set in each on-chip partition, as far as possible in the on-chip memory partition with the high access frequency, thereby reducing the access latency over the on-chip network.
  • The embodiment of the present invention further provides a method for storing the page directory table: the page directory table is partitioned according to the on-chip memory partitions.
  • The page directory table is divided into sub-page directory tables equal in number to the on-chip memory partitions.
  • A sub-page directory table includes a plurality of entries, and an entry records the correspondence between a physical page in the on-chip memory partition where the sub-page directory table is located and the shared virtual memory page corresponding to that physical page.
  • In step 306, the first process searching the page directory table for a physical page corresponding to the second shared virtual memory page may include:
  • The first process searches for the second shared virtual memory page in the sub-page directory table in the on-chip partition of the processor core running the first process. When the second shared virtual memory page is not found in that sub-page directory table, the first process searches the sub-page directory tables in the remaining on-chip memory partitions of the many-core system in near-to-far order, until a shared virtual memory page identical to the second shared virtual memory page is found or the sub-page directory tables in all on-chip memory partitions of the many-core system have been queried.
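The near-to-far lookup can be sketched as below. How distance between partitions is measured is not stated in the source; the `distance_from_home` mapping (for example a hop count on the on-chip network) and the other names are assumptions.

```python
from typing import Dict, Optional, Tuple


def lookup_near_to_far(sub_tables: Dict[int, Dict[int, int]],
                       distance_from_home: Dict[int, int],
                       home: int,
                       vpage: int) -> Optional[Tuple[int, int]]:
    """Search sub-page directory tables for vpage, local partition first.

    sub_tables maps partition id -> {virtual page: physical page}.
    Returns (partition, physical page) or None after all tables are queried.
    """
    others = sorted((p for p in sub_tables if p != home),
                    key=lambda p: distance_from_home[p])
    for partition in [home] + others:
        phys = sub_tables[partition].get(vpage)
        if phys is not None:
            return partition, phys
    return None
```

Searching the local table first means the common case, where a page already sits in the requester's own partition, never touches the on-chip network.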
  • When the slot that satisfies the preset rule is found by the method provided by the embodiment of the present invention, that is, by determining the slot from the index value of the shared virtual page address, the page directory table can be searched in the same way.
  • The slot number of the physical page in each entry is used as the index value of the entry, and the first process uses it when searching the sub-page directory table in an on-chip partition for a shared virtual memory page identical to the second shared virtual memory page.
  • Each physical page in the on-chip memory partition has a slot number, so an entry can correspond to the slot number of the physical page recorded in the entry, and the slot number is used as the index value of the page directory entry.
  • The page directory history table may likewise be divided into sub-page directory history tables equal in number to the on-chip memory partitions.
  • Each sub-page directory history table may be placed in an on-chip memory partition. As shown in FIG. 4, the white horizontally framed portion is the sub-page directory history table and the black portion is the sub-page directory table; placing them in the on-chip memory reduces the query time.
  • Alternatively, the page directory table and the page directory history table may be placed in off-chip memory, which is not limited by the embodiment of the present invention;
  • The sub-page directory history table stored in each on-chip memory partition stores the shared virtual memory pages removed from the sub-page directory table of that on-chip memory partition, together with the access frequencies recorded while each removed shared virtual memory page was in the sub-page directory table.
  • searching for the second shared virtual memory page in the page directory history table may include:
  • the first process searches for a second shared virtual memory page in a subpage directory history table in an on-chip memory partition where the processor core running the first process is located;
  • The storage space of the page directory history table is limited. When the space is insufficient, some entries in the page directory history table need to be discarded, and the embodiment of the present invention provides a discarding method.
  • The page directory history table further records, for each shared virtual memory page, the sum of the frequencies of access by the processor core set of each on-chip memory partition during the period in which the shared virtual memory page was in the page directory table.
  • The first process discards shared virtual memory page information from the page directory history table in ascending order of the frequency sum corresponding to each shared virtual memory page, lowest first.
  • The shared virtual memory page information includes the shared virtual memory page, the frequencies at which the page was accessed by the processor core set in each on-chip memory partition during the period in which the page was in the on-chip memory, and the sum of those frequencies.
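The discard policy can be sketched as follows. The trigger condition (remaining space below a threshold) and the fixed batch size follow the description of the entry discarding unit 510; the parameter names and the dictionary layout are assumptions.

```python
from typing import Dict, List


def discard_low_total(history: Dict[int, Dict[int, int]],
                      capacity: int,
                      min_free: int,
                      batch: int) -> List[int]:
    """Drop the `batch` history entries with the lowest frequency sums
    when fewer than `min_free` slots remain; return the dropped pages.

    history maps virtual page -> {partition id: historical frequency}.
    """
    if capacity - len(history) >= min_free:
        return []  # enough space left, nothing to discard
    by_total = sorted(history, key=lambda v: sum(history[v].values()))
    dropped = by_total[:batch]
    for vpage in dropped:
        del history[vpage]
    return dropped
```

With four entries whose sums are 10, 1, 3 and 9 in a full table of capacity 4, a batch of 2 removes the entries with sums 1 and 3 and keeps the two pages that were historically accessed most.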
  • In this embodiment, the on-chip memory in the many-core system is partitioned and page data is placed according to the access frequency of the processor cores to on-chip memory pages, so as to minimize the influence of the on-chip network on on-chip access latency.
  • An embodiment of the present invention provides a data migration apparatus 50, which is disposed in a many-core system. The many-core system includes a processor having multiple processor cores and is configured with distributed on-chip memory, and the distributed on-chip memory is divided into a plurality of on-chip memory partitions.
  • According to the proximity principle, the processor cores are allocated to the plurality of on-chip memory partitions, and the many-core system runs a plurality of processes belonging to the same application.
  • The data migration device is integrated in the processor and runs a first process, where the first process is any one of the plurality of processes belonging to the same application; the data migration device includes:
  • An access frequency obtaining unit 501, configured to acquire the access frequency of the first shared virtual memory page by the processor core set in each on-chip memory partition; the first shared virtual memory page is any shared virtual memory page in the shared virtual memory space; the access frequency of a processor core set is the sum of the number of accesses by all processor cores belonging to one on-chip memory partition;
  • A migration determining unit 502, configured to determine whether the access frequency of the first shared virtual memory page by the processor core set in the second on-chip memory partition is higher, by the first preset threshold, than the access frequency by the processor core set in the first on-chip memory partition, where the physical page corresponding to the first shared virtual memory page is located;
  • A data migration unit 503, configured to, when the determination result of the migration determining unit 502 is yes, move the data in the physical page corresponding to the first shared virtual memory page to the second on-chip memory partition.
  • The data migration device provided in this embodiment, according to the frequency with which a virtual memory page in the virtual memory space shared among the plurality of processes of the same application is accessed by the processor core set in each on-chip partition, moves the data corresponding to the virtual memory page to the on-chip memory partition where the processor core set with the high access frequency is located, which reduces the latency caused by cross-partition access when the virtual memory page is accessed subsequently.
  • A corresponding page directory table is set in the on-chip memory partition for the plurality of processes belonging to the same application, and the page directory table records the correspondence between physical pages in the on-chip memory and shared virtual memory pages belonging to the shared virtual memory space.
  • The access frequency obtaining unit 501 is configured to obtain, by querying the page directory table, the access frequency of the first shared virtual memory page by the processor core set in each on-chip memory partition;
  • the data migration device further includes:
  • A page directory updating unit 504, configured to, after the data in the physical page corresponding to the first shared virtual memory page is moved to the second on-chip memory partition, update the physical page corresponding to the first shared virtual memory page to the physical page in the second on-chip memory partition that stores the moved data.
  • The page directory table is set to maintain the correspondence between physical pages and shared virtual pages in the on-chip memory, and it is updated after data migration in the on-chip memory. A page directory history table may also be set, which stores the shared virtual memory pages removed from the page directory table and the historical access frequency of the processor core set in each on-chip memory partition during the period in which each shared virtual memory page was in the page directory table;
  • the data migration device 50 further includes:
  • An access unit 505, configured to access a second shared virtual memory page in the shared virtual memory space;
  • A searching unit 506, configured to: when a page fault occurs in the many-core system, search the page directory table for a physical page corresponding to the second shared virtual memory page; and when no physical page corresponding to the second shared virtual memory page is found in the page directory table, search for the second shared virtual memory page in the page directory history table;
  • The page directory updating unit 504 is further configured to, when the second shared virtual memory page is found in the page directory history table, move the second shared virtual memory page from the page directory history table to the page directory table. Further, before or after the information in the page directory history table is moved to the page directory table and the data corresponding to the second shared virtual memory page is migrated to the on-chip memory, an on-chip physical page for storing the corresponding data needs to be selected.
  • The data migration device 50 further includes: a history access frequency obtaining unit 507, configured to, after the searching unit finds the second shared virtual memory page in the page directory history table, obtain from the page directory history table the historical access frequency of the second shared virtual memory page by each processor core set;
  • The data migration unit 503 is further configured to move the data corresponding to the second shared virtual memory page to the target physical page;
  • The page directory updating unit 504 is further configured to add, in the page directory table, the target physical page corresponding to the second shared virtual memory page.
  • The embodiment of the present invention further provides another manner, in which the plurality of processes belonging to the application include a second process that maintains the correspondence between virtual pages and physical pages of the shared virtual memory space; the second process may be requested to perform the page selection and the data migration, and the device further includes:
  • An instruction unit 509, configured to send a request to the second process after the searching unit 506 finds the second shared virtual memory page in the page directory history table, where the request asks the second process to: determine in turn, in descending order of the historical access frequency of the processor core set of each on-chip memory partition during the period in which the second shared virtual memory page was in the on-chip memory, whether there is a target physical page that satisfies the preset rule in the corresponding on-chip memory partition, until the target physical page is obtained; move the data corresponding to the second shared virtual memory page to the target physical page; and add, in the page directory table, the target physical page corresponding to the second shared virtual memory page.
  • If the searching unit 506 does not find the second shared virtual memory page in the page directory history table, then in the data migration device:
  • The page selection unit 508 is configured to determine, when the searching unit 506 does not find the second shared virtual memory page in the page directory history table, whether the on-chip memory partition where the data migration device is located has a target physical page that satisfies the preset rule;
  • When such a target physical page exists, the data migration unit 503 is notified to move the data corresponding to the second shared virtual memory page to the target physical page;
  • The page directory updating unit 504 is further configured to put the correspondence between the second shared virtual memory page and the target physical page into the page directory table, and to record in the page directory table the frequency with which the second shared virtual memory page is accessed by the processor core set in each on-chip memory partition.
  • Each physical page in the on-chip memory partition has a slot number in the partition;
  • The physical page that meets the preset rule is determined as follows: the data migration device obtains an index value according to the second shared virtual page address;
  • The physical page that satisfies the preset rule is a physical page that has a slot number matching the index value and is free.
  • the page selection unit 508 is further configured to: when all the physical pages in the on-chip memory partition in the many-core system and having the slot number matching the index value are not idle physical pages, Selecting, from the physical page having the slot number that matches the index value, a physical page having the lowest access frequency of the corresponding virtual shared memory page as the target physical page;
  • the data migration unit 503 is further configured to: before moving the data corresponding to the second shared virtual memory page to the target physical page, move the shared virtual memory page that the physical address of the target physical page originally corresponded to in the page directory table, together with the frequencies at which that page was accessed by the processor core sets in the on-chip memory partitions, to the page directory history table, and move the original data out of the target physical page.
  • the page directory table provided by the embodiment of the present invention may be composed of the sub-page directory tables stored in the on-chip memory partitions, where each sub-page directory table includes multiple entries, and one entry records the correspondence between a physical page in the on-chip memory partition holding that sub-page directory table and the corresponding shared virtual memory page of the shared virtual memory space, as well as the frequency at which the shared virtual memory page is accessed by the processor core set in each on-chip memory partition; therefore:
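  • the lookup by the searching unit 506 of whether the page directory table has a physical page corresponding to the second shared virtual memory page may include: searching the sub-page directory table of the on-chip partition where the data migration device is located and, if the page is not found there, searching the sub-page directory tables of the remaining on-chip memory partitions from near to far, until the page is found or the sub-page directory tables of all on-chip memory partitions have been queried;
  • the page directory table includes multiple entries; one entry records the correspondence between a physical page in the on-chip memory and the corresponding shared virtual memory page of the shared virtual memory space, as well as the frequency at which that page is accessed by the processor core set in each on-chip memory partition; the slot number of the physical page in each entry serves as the index value of that entry.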
  • the embodiment of the present invention further provides one way of handling the page directory history table when its storage space is insufficient: deciding which entries to discard according to the sum of the historical access frequencies of the shared virtual memory pages in the page directory history table; therefore:
  • the data migration device may further include:
  • an entry discarding unit 510, configured to: when the remaining storage space of the page directory history table is less than a second preset threshold, order the shared virtual memory pages in the page directory history table by their frequency sums from low to high and discard the information of a preset number of shared virtual memory pages with the lowest frequency sums, where the shared virtual memory page information includes the shared virtual memory page, the historical access frequency of that page by the processor core set in each on-chip memory partition, and the sum of the historical access frequencies.
  • the device in this embodiment may be used to perform the method in the foregoing method embodiment, and the implementation principle and the technical effect are similar, and details are not described herein again.
  • an embodiment of the present invention further provides a processor 60, applied to a many-core system that includes the processor; the processor includes multiple processor cores and is configured with distributed on-chip memory;
  • the processor runs multiple processes belonging to the same application, the processes have a shared virtual memory space in the on-chip memory, and the distributed on-chip memory is divided into multiple on-chip memory partitions;
  • the processor cores are assigned to the on-chip memory partitions by proximity, and the processor includes: a processor core 601, a memory 602, a communication interface 603, and a bus 604;
  • a first process runs on the processor core 601, the first process being any one of the multiple processes belonging to the same application;
  • the processor core 601, the communication interface 603, and the memory 602 communicate with one another through the bus 604; the communication interface 603 is configured to receive and send data; the memory 602 is configured to store a program; the processor core 601 is configured to execute the program in the memory 602 to perform any of the methods in the foregoing embodiments of the present invention.
  • a person of ordinary skill in the art can understand that all or some of the steps of the foregoing method embodiments may be implemented by hardware under the control of program instructions; the program may be stored in a computer-readable storage medium and, when executed, performs the steps of the foregoing method embodiments; the storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

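In the embodiments of the present invention, the on-chip memory of a many-core system is partitioned. Based on the frequency at which the processor core set of each on-chip partition accesses a virtual memory page in the virtual memory space shared among multiple processes of the same application, the data corresponding to that virtual memory page is moved to the on-chip memory partition whose processor core set has the higher access frequency. Subsequent accesses to the virtual memory page then incur less latency from cross-partition access.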

Description

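Technical Field
Embodiments of the present invention relate to storage technology, and in particular to a data migration method, device, and processor.
Background
New usage patterns continually emerging on computers lead end users to demand more processing capability, that is, performance, from processors, and the required yearly performance gain keeps accelerating. Multi-core technology, which builds multiple CPU cores on one chip rather than a single core, is currently an effective way to improve processor performance. It lets a server process tasks in parallel, which previously might have required multiple processors. Multi-core systems are also easier to scale and can pack more processing performance into a slimmer form factor that consumes less power and generates less heat.
In the prior art, addressable on-chip memory is configured on the chip for multiple processor cores. One advantage of on-chip memory over off-chip memory is its fast access speed, but its capacity is small; for example, some existing chips are configured with 384 KB of on-chip memory. However, the inventors found that on-chip memory access time is affected by distance on the on-chip network, for example the network for core-to-core communication and the network between cores and on-chip memory: the greater the distance, the longer the access time. For example, in a Distributed Shared Memory (DSM) model with software cache coherence, when a process accesses an address in the shared virtual memory space that does not belong to the memory it maintains, it must copy the data from the physical memory space maintained by another process into its local physical memory space, and because of the on-chip network the latency is often considerable.
Summary of the Invention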
本发明实施例提供一种数据迁移方法、装置及处理器, 用于高效管理片上 内存, 减少片上网络引起的时延。 第一方面, 本发明实施例提供一种数据迁移方法, 其特征在于, 应用于众 核系统中, 所述众核系统中包括具有多个处理器核的处理器, 并配置有分布式 片上内存, 所述分布式片上内存被划分为多个片上内存分区, 根据就近原则, 将多个处理器核在所述多个片上内存分区中进行分配 ,所述众核系统中运行有 多个属于同一应用程序的进程,所述进程间的所述片上内存中拥有一段共享虚 拟内存空间, 所述方法包括: 第一进程获取第一共享虚拟内存页面被每个片上内存分区中处理器核集 合的访问频率;所述第一进程为所述同属于一个应用程序中的多个进程中任一 进程;所述第一共享虚拟内存页面为所述共享虚拟内存空间中任一共享虚拟内 存页面;所述处理器核集合的访问频率为归属于一个片上内存分区中所有处理 器核访问次数的总和; 所述第一进程判断所述第一共享虚拟内存页面被第二片上内存分区中处 理器核集合的访问频率比被所述第一共享虚拟内存页面对应的物理页面所在 的第一片上内存分区中的处理器核集合的访问频率高出第一预设阈值;
则所述第一进程将所述第一共享虚拟内存页面对应的物理页面中的数据 移出至所述第二片上内存分区中。 结合第一方面, 本发明实施例给了第一种可能的实现方式, 所述的众核系 统中将片上内存分区中针对同属于一个应用程序的多个进程设置对应的页目 录表,所述的页目录表中记录有所述片上内存中物理页面与属于所述共享虚拟 内存空间中的共享虚拟内存页面之间的对应关系,及该共享虚拟内存页面被每 个片上内存分区中的处理器核集合的访问频率; 所述第一进程获取第一共享虚拟内存页面被每个片上内存分区中处理器 核集合的访问频率, 包括: 所述第一进程通过查询所述页目录表获取所述第一共享虚拟内存页面被 每个片上内存分区中处理器核集合的访问频率; 在所述第一进程将所述第一共享虚拟内存页面对应的物理页面中的数据 移出至所述第二片上内存分区中之后, 还包括:
所述第一进程将所述第一共享虚拟内存页面对应的物理页面更新为所述
结合第一方面的第一种可能实现方式,本发明实施例提供了第一方面的第 二种可能的实现方式,所述众核系统中还存储有对应所述页目录表的页目录历 史表, 用于存储从所述页目录表中移出的共享虚拟内存页面, 以及该共享虚拟 合的历史访问频率, 所述方法还包括: 所述第一进程访问所述共享虚拟内存空间中的第二共享虚拟内存页面; 当众核系统中出现缺页错误时,所述第一进程在所述页目录表中查找是否 有与所述第二共享虚拟内存页面对应的物理页面; 当在所述页目录表中没有查找到与所述第二共享虚拟内存页面对应的物 理页面, 则在所述页目录历史表中查找是否有所述第二共享虚拟内存页面; 当在所述页目录历史表中查找到所述第二共享虚拟内存页面,将所述第二 共享虚拟内存页面从所述页目录历史表中移出至所述页目录表中。 结合第一方面的第二种可能的实现方式,本发明实施例还提供了第一方面 的第三种可能方式,当在所述页目录历史表中查找到所述第二共享虚拟内存页 面之后, 还包括: 所述第一进程从所述页目录历史表中获取所述第二共享虚拟内存页面在 区中的处理器核集合的历史访 问频率; 根据所述第二共享
区的所述处理核集合访问的历史访问频率从高到低的顺序,依次在所述片上内 存分区中确定是否有满足预设规则的目标物理页面, 直到获得目标物理页面, 在所述页目录表中添加与所述第二共享虚拟内存页面对应的目标物理页 面 结合第一方面的第二种可能的实现方式,本发明实施例还提供了第一方面 的第四种可能方式,所述同属于应用程序的多个进程中包括维护有所述共享虚 拟内存空间的虚拟页面和物理页面之间对应关系的第二进程;当在所述页目录 历史表中查找到所述第二共享虚拟内存页面之后, 还包括:
所述第一进程向所述第二进程发送请求,所述请求用于请求所述第二进程 所述处理核集合访问的历史访问频率从高到低的顺序,依次在所述片上内存分 区中确定是否有满足预设规则的目标物理页面, 直到获得目标物理页面,将所 述第二共享虚拟内存对应的数据移出至所述目标物理页面中;在所述页目录表 中添加与所述第二共享虚拟内存页面对应的所述目标物理页面。
结合第一方面第二种可能的实现方式, 本发明实施例提供了第一方面的第五 种可能实现方式, 当在所述页目录历史表中没有查找到所述第二共享虚拟内存页 面, 则判断运行所述第一进程的处理器核所在的片上内存分区是否有满足预设规 则的目标物理页面;
如果有, 所述第一进程将所述第二共享虚拟内存页面对应的数据移出到所述 目标物理页面中;
如果否, 则由近及远的判断所述运行所述第一进程的处理器核所在的片上内 存分区临近的片上内存分区中是否有满足预设规则的目标物理页面, 直到查找到 满足所述预设规则的目标物理页面; 将所述第二共享虚拟内存页面对应的数据移 出至所述目标物理页面中;
所述第一进程将所述第二共享虚拟内存页面与所述目标物理页面的对应 关系放入所述页目录表中 ,在所述页目录表中记录所述第二共享虚拟内存页面 被每个片上内存分区中的处理器核集合的访问频率。 结合本发明实施例提供的第一方面的第三种至第五种任一可能的实现方 式, 本发明实施例还提供了第一方面的第六种可能的实现方式, 所述片上内存 分区中的每个物理页面具有在分区内的槽位号;所述方法中满足预设规则的物 理页面包括: 所述第一进程根据所述第二共享虚拟页面地址获得索引值; 所述满足预设规则的物理页面为具有与所述索引值匹配的槽位号的,且空 闲的物理页面。 结合第一方面的第六种可能的实现方式,本发明实施例提供了第一方面的 第七种可能的实现方式,当所述众核系统中所有片上内存分区中与具有与所述 索引值匹配槽位号的物理页面都不是空闲物理页面,在所述与具有与所述索引 值匹配的槽位号的物理页面中 ,选取一个其对应的虚拟共享内存页面的访问频 率最低的物理页面作为目标物理页面; 所述将所述第二共享虚拟内存页面对应的数据移出至所述目标物理页面 中之前, 还包括:
将所述目标物理页面的物理地址在所述页目录表中原本对应的共享虚拟 内存页面以及分别被所述片上内存分区中的处理器核集合访问的频率移出至 所述页目录历史表中, 并将所述目标物理页面中原有数据移出。 结合第一方面的第六种可能的实现方式,本发明实施例还提供了第八种可 能实现方式, 所述页目录表包括多个表项, 一个表项记录所述片上内存中一个 物理页面与该物理页面对应的所述共享虚拟内存空间的共享虚拟内存页面之 间的对应关系以及该共享虚拟内存页面分别被所述每个片上内存分区中的处 理器核集合的访问频率;所述页目录表中每个表项中物理页面的槽位号作为表 项的索引值; 所述第一进程在所述页目录表中查找所述第二共享虚拟内存页 面, 包括:
所述第一进程根据所述第二共享虚拟内存页面地址计算获得一个索引值; 根据所述表项的索引值,在所述页目录表中确定具有和所述获得的索引值 匹配的表项;判断所述匹配的表项中的共享虚拟内存页面与所述第二共享虚拟 内存页面是否相同。 结合第一方面的二种至第五种任一种可能实现方式,本发明实施例还提供 了第一方面的第九种可能实现方式,所述页目录历史表中还记录有共享虚拟内 存页面在所述页目录表中的时间段内被每个所述片上内存分区的处理器核集 合访问的频率的总和, 所述方法还包括:
在所述页目录历史表剩余存储空间小于第二预设阈值时,所述第一进程将 所述页目录历史表中的每个共享虚拟内存页面对应的所述频率总和从低到高 的顺序,丟弃所述频率总和最低的预设数量的共享虚拟内存页面信息, 所述共 享虚拟内存页面信息包括共享虚拟内存页面,以及该共享虚拟内存页面被每个 所述片上内存分区中的处理器核集合的历史访问频率和所述历史访问频率的 总和。 第二方面,本发明实施例还提供了一种数据迁移装置,设置于众核系统中, 所述众核系统包括具有多个处理器核的处理器, 并配置有分布式片上内存, 所 述分布式片上内存被划分为多个片上内存分区,根据就近原则,将多个处理器 核在所述多个片上内存分区中进行分配 ,所述众核系统中运行有多个属于同一 应用的进程, 所述进程间在片上内存中拥有一段共享虚拟内存空间, 所述数据 迁移装置集成于所述处理器中, 所述数据迁移装置中运行有第一进程, 所述第 一进程为所述同属于一个应用程序中的多个进程中任一进程;所述数据迁移装 置包括:
访问频率获取单元,用于获取第一共享虚拟内存页面被每个片上内存分区 中处理器核集合的访问频率;所述第一共享虚拟内存页面为所述共享虚拟内存 空间中任一共享虚拟内存页面;所述处理器核集合的访问频率为归属于一个片 上内存分区中所有处理器核访问次数的总和;
迁移判断单元,用于判断所述第一共享虚拟内存页面被第二片上内存分区 中处理器核集合的访问频率比被所述第一共享虚拟内存页面对应的物理页面 所在的第一片上内存分区中的处理器核集合的访问频率是否高出第一预设阈 值;
数据迁移单元, 用于当所述迁移判断单元的判断结果为是时,将所述第一 共享虚拟内存页面对应的物理页面中的数据移出至所述第二片上内存分区中。
结合第二方面, 本发明实施例提供第二方面的第一种可能实现方式, 所述 的众核系统中将片上内存分区中针对同属于一个应用程序的多个进程设置对 应的页目录表,所述的页目录表中记录有所述片上内存中物理页面与属于所述 共享虚拟内存空间中的共享虚拟内存页面之间的对应关系,及该共享虚拟内存 页面被每个片上内存分区中的处理器核集合的访问频率;
所述访问频率获取单元,具体用于通过查询所述页目录表获取所述第一共 享虚拟内存页面被每个片上内存分区中处理器核集合的访问频率;
所述装置还包括: 页目录更新单元, 用于在将所述第一共享虚拟内存页面 对应的物理页面中的数据移出至所述第二片上内存分区中之后,将所述第一共 享虚拟内存页面对应的物理页面更新为所述第二片上内存分区中用于存储所 述移出数据的物理页面。
结合第二方面的第一种可能实现方式,本发明实施例还提供了第二方面的 第二种可能实现方式,所述众核系统中还存储有对应所述页目录表的页目录历 史表, 用于存储从所述页目录表中移出的共享虚拟内存页面, 以及该共享虚拟 合的历史访问频率, 所述数据迁移装置, 还包括:
访问单元, 用于访问所述共享虚拟内存空间中的第二共享虚拟内存页面; 查找单元, 用于当众核系统中出现缺页错误时,在所述页目录表中查找是 否有与所述第二共享虚拟内存页面对应的物理页面;当在所述页目录表中没有 查找到与所述第二共享虚拟内存页面对应的物理页面,则在所述页目录历史表 中查找是否有所述第二共享虚拟内存页面;
所述页目录更新单元,还用于当在所述页目录历史表中查找到所述第二共 享虚拟内存页面,将所述第二共享虚拟内存页面从所述页目录历史表中移出至 所述页目录表中。
结合第二方面的第二种可能实现方式,本发明实施例还提供了第二方面的 第三种可能实现方式, 所述数据迁移装置还包括:
历史访问频率获取单元,用于在所述查找单元在所述页目录历史表中查找 到所述第二共享虚拟内存页面之后,从所述页目录历史表中获取所述第二共享 核集合的历史访问频率; 页面选择单元,用于根据所述历史访问频率获取单元所获取的第二共享虚 拟内存页面的历史访问频率从高到低的顺序,依次在所述片上内存分区中确定 是否有满足预设规则的目标物理页面, 直到获得目标物理页面; 所述数据迁移单元 ,还用于将所述第二共享虚拟内存页面对应的数据移出 至所述目标物理页面中; 所述页目录更新单元,还用于在所述页目录表中添加与所述第二共享虚拟 内存页面对应的目标物理页面。
结合第二方面的第二种可能实现方式,本发明实施例还提供了第二方面的 第四种可能实现方式,所述同属于应用程序的多个进程中包括维护有所述共享 虚拟内存空间的虚拟页面和物理页面之间对应关系的第二进程;所述装置还包 括:
指令单元,用于在所述查找单元在在所述页目录历史表中查找到所述第二 共享虚拟内存页面之后, 向所述第二进程发送请求, 所述请求用于请求所述第 分区的所述处理核集合访问的历史访问频率从高到低的顺序,依次在所述片上 内存分区中确定是否有满足预设规则的目标物理页面, 直到获得目标物理页 面,将所述第二共享虚拟内存对应的数据移出至所述目标物理页面中; 在所述 页目录表中添加与所述第二共享虚拟内存页面对应的所述目标物理页面。
结合第二方面的第二种可能的实现方式, 本发明实施例还提供了第二方面的 第五种可能实现方式, 所述装置还包括:
页面选择单元, 用于当所述查找单元在所述页目录历史表中没有查找到所述 第二共享虚拟内存页面, 则判断所述数据迁移装置所在的片上内存分区是否有满 足预设规则的目标物理页面; 如果有,通知所述数据迁移单元将所述第二共享虚拟内存页面对应的数据 移出到所述目标物理页面中;
如果否,则由近及远的判断所述数据迁移装置所在的片上内存分区临近的 片上内存分区中是否有满足预设规则的目标物理页面,直到查找到满足所述预 设规则的目标物理页面后,通知所述数据迁移单元将所述第二共享虚拟内存页 面对应的数据移出到所述目标物理页面中;
所述页目录更新单元,还用于将所述第二共享虚拟内存页面与所述目标物 理页面的对应关系放入所述页目录表中,在所述页目录表中记录所述第二共享 虚拟内存页面被每个片上内存分区中的处理器核集合的访问频率。
结合第二方面的第三种或第五种可能的实现方式,本发明实施例提供了第 二方面的第六种可能的实现方式,所述片上内存分区中的每个物理页面具有在 分区内的槽位号; 所述满足预设规则的物理页面包括: 满足预设规则的物理页面包括: 所述数据迁移装置根据所述第二共享虚拟页面地址获得索引值; 所述满足预设规则的物理页面为具有与所述索引值匹配的槽位号的,且空 闲的物理页面。 结合第二方面的第二种至第五种任一可能实现方式,本发明实施还提供第 二方面的第七种可能实现方式,所述页目录历史表中还记录有共享虚拟内存页 面在所述页目录表中的时间段内被每个所述片上内存分区的处理器核集合访 问的频率的总和, 装置还包括:
表项丟弃单元,用于在所述页目录历史表剩余存储空间小于第二预设阈值 时,将所述页目录历史表中的每个共享虚拟内存页面对应的所述频率总和从低 到高的顺序,丟弃所述频率总和最低的预设数量的共享虚拟内存页面信息, 所 述共享虚拟内存页面信息包括共享虚拟内存页面,以及该共享虚拟内存页面被 每个所述片上内存分区中的处理器核集合的历史访问频率和所述历史访问频 率的总和。
第三方面, 本发明实施例提供一种处理器, 应用于众核系统中, 所述众核 系统中包括所述处理器, 所述处理器包括多个处理器核, 并配置有分布式片上 内存, 所述处理器中运行有多个属于同一应用程序的进程, 所述进程间所述片 上内存中拥有一段共享虚拟内存空间,所述分布式片上内存被划分为多个片上 内存分区,根据就近原则,将多个处理器核在所述多个片上内存分区中进行分 配, 所述处理器包括: 处理器核, 存储器, 通信接口, 总线; 所述处理器核中 运行有第一进程 ,所述第一进程为所述同属于一个应用程序中的多个进程中任 一进程;
所述处理器核、 通信接口、存储器通过所述总线相互的通信; 所述通信接 口, 用于接收和发送数据; 所述存储器用于存储程序;
所述处理器核用于执行所述存储器中的所述程序, 执行第一方面所提 供方法以及该方法任一所述的可能的实现方式。
本发明实施例中,通过将众核系统中的片上内存进行分区,根据将同属于 一个应用程序的多个进程间共享的虚拟内存空间中的虚拟内存页面被每个片 上分区中的处理器核集合的访问频率,将虚拟内存页面对应的数据移出至访问 频率高的处理器核集合所在的片上内存分区, 在后续对该虚拟内存页面访问 时, 减少了因跨分区访问而造成的时延。
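Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the following briefly introduces the accompanying drawings needed to describe the embodiments or the prior art. Apparently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the on-chip memory partition structure in a many-core system according to an embodiment of the present invention;
FIG. 2 is a flowchart of a data migration method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of page placement according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of looking up a target physical page by index value according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a data migration device according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a processor according to an embodiment of the present invention.
Detailed Description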
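To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Apparently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The embodiments of the present invention apply to a many-core system that includes a processor with multiple processor cores, on which multiple processes belonging to different applications run. The many-core system is configured with distributed on-chip memory, which is divided into multiple on-chip memory partitions. The processor cores are assigned to the on-chip memory partitions by proximity, and all processor cores in one on-chip partition are called a processor core set. Suppose the distributed on-chip memory is divided into 4 partitions, as in the schematic of on-chip memory partitions in a many-core system shown in FIG. 1, where a page is assumed to be 4 KB.
Referring to FIG. 2, on the basis of the foregoing architecture, the data migration method provided by an embodiment of the present invention is described in detail, taking as an example any one of the multiple processes that run on the processor cores of the many-core system and belong to the same application, where those processes have a shared virtual memory space in the on-chip memory. The data migration method includes:
Step 201: A first process obtains the frequency at which a first shared virtual memory page is accessed by the processor core set in each on-chip memory partition;
Here, the first process is any one of the multiple processes belonging to the same application; the first shared virtual memory page is any shared virtual memory page in the shared virtual memory space; a processor core set consists of all processor cores in one on-chip partition; and the access frequency of a processor core set for a shared virtual page is the total number of accesses to that page by all processor cores belonging to one on-chip memory partition. For example, the total number of accesses to the first shared virtual memory page by all processor cores of an on-chip memory partition is the access frequency of that partition's processor core set;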
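In the embodiments, the distributed on-chip memory configured in the many-core system is partitioned, and each processor core is assigned to a nearby on-chip memory partition by proximity. This does not exclude a processor core belonging to two adjacent on-chip memory partitions, or even an on-chip memory partition containing no processor core;
The condition that triggers a process to dynamically adjust the data in on-chip memory can be set according to the actual situation, for example when all processes of the same application reach a global synchronization point such as a barrier; the embodiments of the present invention do not limit this.
Step 202: The first process determines whether the access frequency F2 of the processor core set in a second on-chip memory partition for the first shared virtual memory page exceeds, by a first preset threshold, the access frequency F1 of the processor core set in the first on-chip memory partition, where the physical page corresponding to the first shared virtual memory page is located. If yes, go to step 203; if no, the procedure ends;
The first preset threshold is set according to the actual situation; the embodiments of the present invention do not limit it;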
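Step 203: The first process moves the data in the on-chip-memory physical page corresponding to the first shared virtual memory page to the second on-chip memory partition;
In a specific implementation, the first process may send the CPU an instruction to move the data, thereby moving the data to the second on-chip memory partition;
For clarity, the embodiments call the on-chip memory partition holding the physical page that corresponds to the first shared virtual memory page the first on-chip memory partition, and call any other on-chip memory partition in the many-core system a second on-chip memory partition. In a specific implementation, the first process may ask the operating system to establish, in the process page table managed by the operating system, the correspondence between the first shared virtual memory page and a target physical page, and the first process then moves the data to that target physical page in the second on-chip memory partition;
If the processor core sets of several on-chip memory partitions other than the first one all access the first shared virtual memory page at a frequency above the first preset threshold, the partition whose processor core set accesses the page most frequently may be chosen as the second on-chip memory partition.
In the embodiments of the present invention, the on-chip memory of the many-core system is partitioned. Based on the frequency at which the processor core set of each on-chip partition accesses a virtual memory page in the virtual memory space shared among multiple processes of the same application, the data in the on-chip physical page corresponding to that virtual memory page is moved to the on-chip memory partition whose processor core set has the higher access frequency, so that subsequent accesses to the virtual memory page incur less latency from cross-partition access.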
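The decision in steps 201 to 203 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `pick_target_partition`, the list-of-counters representation, and the strict "greater than" comparison against the threshold are all assumptions made for the example.

```python
from typing import Optional, Sequence


def pick_target_partition(freqs: Sequence[int], home: int, threshold: int) -> Optional[int]:
    """Return the partition a page should migrate to, or None to leave it in place.

    freqs[p]  : accesses to the page by the processor core set of partition p (step 201)
    home      : partition currently holding the page's physical page (first partition)
    threshold : first preset threshold (hypothetical strict comparison)
    """
    best = None
    for part, freq in enumerate(freqs):
        if part == home:
            continue
        # Step 202: F2 must exceed F1 by more than the threshold.
        if freq - freqs[home] > threshold:
            # Several partitions may qualify; keep the most frequent accessor.
            if best is None or freq > freqs[best]:
                best = part
    return best  # Step 203 would move the page data to this partition.
```

With per-partition counts `[5, 20, 30, 1]`, home partition 0, and a threshold of 10, partitions 1 and 2 both qualify and partition 2 is chosen because it accesses the page most often.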
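In a specific implementation, the many-core system sets up, in the on-chip memory partitions, a page directory table for the multiple processes belonging to the same application. The processes of one application share one shared virtual memory space, and the page directory table records the correspondence between physical pages in the on-chip memory and shared virtual memory pages in the shared virtual memory space. In practice, the table can record the correspondence between the start address of a shared virtual memory page and the start address of a physical page. The page directory table also records, for each shared virtual memory page in the table, the frequency at which it is accessed by the processor core set in each on-chip memory partition. Note that because shared virtual memory pages and on-chip physical pages correspond to each other in an application's page directory table, the processor-core-set access frequency of a shared virtual memory page equals that of its corresponding physical page;
Table 1 below gives an example of the page directory table:
Virtual memory page address: by default a page is usually 4 KB, so the low three digits of the hexadecimal virtual memory page address are 0;
Physical page address: the address of the physical memory page, in this partition's on-chip memory, of the shared virtual memory page identified by the virtual memory page address. It can be the actual physical address of the physical page or an offset from the starting physical address of the partition's on-chip memory;
Access frequency: records how frequently the processor cores in each partition access the shared virtual memory page. On-chip memory is globally accessible, so a processor core in one partition can access on-chip memory in other partitions. The access frequencies of all processor cores in a partition are aggregated into the access frequency of that partition's processor core set, so in the frequencies field each partition has exactly one access frequency value for the shared virtual memory page. The frequencies field has one subfield per partition;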
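[Table 1: image not reproduced. Per the field descriptions above, each page directory table entry holds a virtual memory page address, a physical page address, and a frequencies field with one subfield per partition.]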
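As an illustration of the entry layout described for Table 1, the sketch below models one page directory table entry. The class name `PageDirectoryEntry`, the helper `page_base`, and the choice of four partitions (matching the FIG. 1 example) are assumptions for this example only.

```python
from dataclasses import dataclass, field
from typing import List

PAGE_SIZE = 4096      # 4 KB pages, as assumed in the description
NUM_PARTITIONS = 4    # hypothetical: the FIG. 1 example uses 4 partitions


@dataclass
class PageDirectoryEntry:
    virtual_page: int    # start address of the shared virtual memory page
    physical_page: int   # physical address, or offset within the partition
    # One access counter per on-chip memory partition (the frequencies field).
    frequencies: List[int] = field(default_factory=lambda: [0] * NUM_PARTITIONS)

    def record_access(self, partition: int) -> None:
        """Count one access by any core in the given partition's core set."""
        self.frequencies[partition] += 1


def page_base(addr: int) -> int:
    """Clear the low 12 bits so the hex address ends in three zero digits."""
    return addr & ~(PAGE_SIZE - 1)
```

For example, `page_base(0x12345)` gives `0x12000`, which is the form of address stored in the virtual memory page address field.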
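The access frequency of a virtual memory page by a processor core set may be computed with an aging mechanism: the access frequency is a weighted sum of the frequencies with which processes accessed the shared virtual memory page in each phase, with larger weights for newer phases and smaller weights for older ones. The resulting access frequency better captures the processes' recent access behavior for the shared virtual memory page.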
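One common way to realize such an aging mechanism is exponential decay, sketched below. The source only requires that newer phases weigh more than older ones; the geometric weights and the `decay` parameter are assumptions chosen for illustration.

```python
from typing import Sequence


def aged_frequency(phase_counts: Sequence[int], decay: float = 0.5) -> float:
    """Weighted sum of per-phase access counts, oldest phase first.

    Each time a new phase is folded in, everything older is scaled by
    `decay`, so the newest phase has weight 1, the previous one `decay`,
    the one before that `decay ** 2`, and so on.
    """
    total = 0.0
    for count in phase_counts:
        total = total * decay + count
    return total
```

With this weighting, a page accessed 10 times in the latest phase scores higher than a page accessed 10 times three phases ago, which is the behavior the aging mechanism is meant to produce.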
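In step 201, the first process may obtain the frequency at which the first shared virtual memory page is accessed by the processor core set in each on-chip memory partition by querying the page directory table;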
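Because the page directory table maintains the correspondence between the shared virtual memory pages of the virtual memory space shared among the processes and the on-chip physical pages, once the data corresponding to the first shared virtual memory page has been moved to the second on-chip memory partition, the correspondence between the first shared virtual memory page and its physical page must be updated. Therefore, after step 203, the method may further include:
Step 204: The first process updates the physical page corresponding to the first shared virtual memory page to the physical page in the second on-chip memory partition that stores the moved data.
Because on-chip memory space is limited, the embodiments try to keep the data of frequently accessed shared virtual pages in physical pages of on-chip memory. When on-chip memory is full and a new shared virtual memory page is accessed, a physical page with low access frequency must be moved out of on-chip memory, and at the same time the shared virtual memory page corresponding to that physical page is moved out of the page directory table [...] the access frequency of the processor core set in each on-chip memory partition, that is, the historical access frequency of the processor core set of each on-chip memory partition. Similar to the page directory table, one page directory history table is set up per application to maintain information such as the shared virtual memory pages moved out of that application's page directory table. Because the embodiments take a virtual memory space shared among processes of the same application as the example, and the page directory table is built for that shared virtual memory space, the page directory history table is likewise dedicated to maintaining the shared virtual memory pages moved out of that page directory table and the historical access frequencies of those pages by the processor core sets of the on-chip memory;
Table 2 below gives an example of the page directory history table:
Virtual memory page address: the shared virtual memory page address, with the same meaning as the corresponding field of a page directory table entry;
Historical access frequency: records the historical access frequency of the shared virtual memory page by the processor core set in each partition;
Sum of historical access frequencies: the sum of the historical access frequency values of the processor core sets of all partitions in the historical access frequency field. This field reflects the relative importance of the shared virtual pages in the page directory history table: a page with a high sum is more important than a page with a low sum. When the page directory history table is full, entries of pages with a low sum are discarded first, as described later.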
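[Table 2: image not reproduced. Per the field descriptions above, each page directory history table entry holds a virtual memory page address, the per-partition historical access frequencies, and the sum of the historical access frequencies.]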
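Therefore, referring to FIG. 3, when a second shared virtual memory page in the shared virtual memory space is accessed and is not found in the page directory table, no physical page in the current on-chip memory corresponds to it. The data corresponding to the second shared virtual memory page is then placed into on-chip memory; this page placement procedure can be implemented by the following steps:
Step 305: The first process accesses the second shared virtual memory page in the shared virtual memory space;
Step 306: When a page fault occurs in the many-core system, the first process looks up in the page directory table whether there is a physical page corresponding to the second shared virtual memory page; if not, go to step 307;
In the many-core system, the operating system maintains, for the running processes of an application, a process page table holding the mapping between virtual and physical addresses. A page fault occurs in the many-core system when, for example, the virtual page a process accesses is not in the process page table, or the access is a write request but the physical page mapped to that virtual page in the process page table accepts only read requests. When a page fault occurs, the first process looks up in the page directory table stored in on-chip memory whether there is a physical page corresponding to the virtual page;
Step 307: Look up whether the second shared virtual memory page is in the page directory history table; if yes, go to step 308;
Because the page directory table maintains only the correspondence between physical pages currently in on-chip memory and shared virtual memory pages, when the accessed shared virtual page is not found in the page directory table, the page directory history table can be checked next;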
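Step 308: According to the historical access frequencies, recorded in the page directory history table, of each on-chip memory partition's processor core set for the shared virtual page during the time it was in the page directory table, determine whether the on-chip memory partition with the highest historical access frequency for the second shared virtual memory page has a target physical page that satisfies the preset rule; if no, go to step 309b; if yes, go to step 309a;
Step 309a: The first process moves the data corresponding to the second shared virtual memory page to the target physical page satisfying the preset rule in the on-chip memory partition with the highest historical access frequency; go to step 310;
Step 309b: In descending order of the historical access frequencies, recorded in the page directory history table, of each on-chip memory partition's processor core set for the second shared virtual memory page, check in turn whether the on-chip memory partition of each processor core set has a target physical page that satisfies the preset rule, until a target physical page is obtained, and move the data corresponding to the second shared virtual memory page to it; go to step 310;
Step 310: Move the second shared virtual memory page from the page directory history table to the page directory table;
In the embodiments, after a shared virtual memory page identical to the second shared virtual memory page is found in the page directory history table, the method further includes:
Step 311: Update the correspondence between the second shared virtual memory page and the physical page in the page directory table;
Because access frequency is computed with an aging mechanism in the embodiments, the historical access frequencies in the page directory history table mean little for the present. They may therefore be moved to the page directory table along with the page, or simply not recorded in the page directory table after the move;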
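As the above embodiment shows, placing a shared virtual memory page from the page directory history table into the page directory table also requires finding a physical page in on-chip memory to hold the page's data, so that the correspondence between the shared virtual memory page and the physical page in the page directory table can be completed. The update can be done in several ways; the embodiments provide two:
Mode 1, the first process performs the update:
The first process obtains, from the page directory history table, the historical access frequency of the second shared virtual memory page by the processor core set in each on-chip memory partition during the time the page was in on-chip memory;
In descending order of the historical access frequencies of the processor core sets of the on-chip memory partitions, it checks in turn whether each on-chip memory partition has a target physical page that satisfies the preset rule, until a target physical page is obtained, and moves the data corresponding to the second shared virtual memory page to the target physical page;
It adds the correspondence between the second shared virtual memory page and the target physical page to the page directory table.
Alternatively, Mode 2 can be used: the processes belonging to the same application include a second process that maintains the correspondence between virtual pages and physical pages of the shared virtual memory space, and the first process asks that second process to perform the update:
The first process sends a request to the second process, asking it to check, in descending order of each on-chip memory partition's processor-core-set historical access frequency for the second shared virtual memory page, whether the on-chip memory partition of each processor core set has a target physical page that satisfies the preset rule, until a target physical page is obtained; to move the data corresponding to the second shared virtual memory page to the target physical page; and to add the correspondence between the second shared virtual memory page and the target physical page to the page directory table.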
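The partition search shared by both modes (steps 308 to 309b) can be sketched as below. The function name `place_by_history` and the callback `find_target`, which stands in for whatever preset rule is configured, are hypothetical names introduced for this example.

```python
from typing import Callable, Optional, Sequence, Tuple


def place_by_history(
    hist_freqs: Sequence[int],
    find_target: Callable[[int], Optional[int]],
) -> Optional[Tuple[int, int]]:
    """Try partitions from highest to lowest historical access frequency.

    hist_freqs[p] : historical frequency of partition p's core set for the page
    find_target   : returns a physical page in partition p that satisfies the
                    preset rule, or None if the partition has no such page
    Returns (partition, physical_page) for the first hit, or None.
    """
    order = sorted(range(len(hist_freqs)), key=lambda p: hist_freqs[p], reverse=True)
    for part in order:
        page = find_target(part)
        if page is not None:
            return part, page
    return None
```

In Mode 1 the first process would run this search itself; in Mode 2 the same logic would run in the second process after it receives the request.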
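In the embodiments, the page directory history table need only maintain the shared virtual memory pages moved out of the page directory table, together with the frequency at which each such page was accessed by the processor core set of each on-chip partition during the time it was in the page directory table; it need not maintain the correspondence between shared virtual memory pages and physical pages. Data moved out of an on-chip physical page is stored in a new physical page in off-chip memory, and a dedicated process maintains the correspondence between shared virtual memory pages and physical pages; for example, under the Distributed Shared Memory (DSM) model, the home process maintains it.
If, in step 307, no shared virtual memory page identical to the second shared virtual memory page is found in the page directory history table either, a suitable physical page can be sought directly in the on-chip memory partition of the processor core running the first process. Therefore, when the second shared virtual memory page is not found in the page directory history table, the method may further include:
Step 312: Determine whether the on-chip memory partition of the processor core running the first process has a target physical page that satisfies the preset rule; if yes, go to step 316; if no, go to step 313;
The first process can request the data corresponding to the second shared virtual memory page from the process that maintains the correspondence between virtual and physical pages of the shared virtual memory space. For ease of description, the process that maintains, in the operating system's process page table, the correspondence between virtual and physical addresses of the shared virtual memory space is called the second process; the second process and the first process belong to the same application;
Step 313: The first process checks, from near to far, whether the on-chip memory partitions neighboring the partition of the processor core running the first process have a target physical page that satisfies the preset rule; once one is found, go to step 316; if every on-chip memory partition has been searched and none has a physical page that satisfies the preset rule, go to step 314;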
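The preset rule for determining the target physical page can be set by the user as needed. For example, with "the physical page is free" as the preset rule, any free physical page in an on-chip memory partition satisfies it;
The embodiments give one rule for determining the target physical page that makes the lookup more efficient. Referring to FIG. 4, each physical page in an on-chip memory partition has a slot number within its partition. Slot numbers within a partition may be unique or may repeat; in FIG. 4, for example, each slot number appears twice in every partition. The black parts are the page directory table stored in the on-chip memory partitions and the white horizontal boxes are the page directory history table, both described in detail later. An index value is computed from the shared virtual memory address being looked up; the computation may be a hash or some other algorithm, which the embodiments do not limit. The on-chip memory partition is searched for the slot number matching the index value, and if the physical page in the matching slot is free, it is taken as the target physical page that satisfies the preset rule;
The embodiments do not limit the exact matching relation. For example, a match may mean the two are equal, or each slot number may correspond to a range of index values, with a match when the index value obtained for the shared virtual memory page falls within the range of some slot number;
Accordingly, determining a target physical page that satisfies the preset rule in the foregoing steps includes:
The first process obtains an index value from the second shared virtual memory page; a physical page that satisfies the preset rule is a free physical page whose slot number matches the index value.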
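The slot-number rule can be illustrated with the sketch below. The source leaves the index computation open (a hash or any other algorithm), so the modulo on the virtual page number, the equality match, and the names `slot_index` and `find_target_page` are assumptions for this example.

```python
from typing import List, Optional, Tuple

PAGE_SIZE = 4096  # 4 KB pages, as assumed in the description


def slot_index(virtual_page_addr: int, num_slots: int) -> int:
    """Hypothetical index function: virtual page number modulo slot count."""
    return (virtual_page_addr // PAGE_SIZE) % num_slots


def find_target_page(
    pages: List[Tuple[int, bool]],
    virtual_page_addr: int,
    num_slots: int,
) -> Optional[int]:
    """Return the position of a free page whose slot number equals the index.

    pages holds one (slot_number, is_free) pair per physical page in a
    partition; slot numbers may repeat within the partition.
    """
    index = slot_index(virtual_page_addr, num_slots)
    for position, (slot, is_free) in enumerate(pages):
        if slot == index and is_free:
            return position
    return None
```

Because only pages in the matching slot are examined, the search touches a small fraction of a partition's physical pages, which is the efficiency gain the rule aims for.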
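When the index-value and slot-number rule provided by the embodiments is used to determine the target physical page, and in step 313 none of the physical pages in any on-chip memory partition of the many-core system whose slot number matches the index value is free, go to step 314:
Step 314: Among the on-chip physical pages whose slot number matches the index value, select the one whose corresponding shared virtual memory page has the lowest access frequency as the target physical page; go to step 315;
When no physical page that satisfies the preset rule is found in on-chip memory, that is, when none of the physical pages in slots matching the index value of the second shared virtual memory page is free, one of the physical pages in on-chip memory must be evicted and replaced with the data of the second shared virtual memory page. Before the replacement, the original data in the target physical page must be moved out, and the shared virtual memory page originally mapped to it in the page directory table, together with its access frequencies by the processor core sets of the on-chip memory partitions, must be moved to the page directory history table. Therefore, before step 316, the method may further include:
Step 315: Move the shared virtual memory page that the physical address of the target physical page originally corresponded to in the page directory table, together with the frequencies at which it was accessed by the processor core sets of the on-chip memory partitions, to the page directory history table, and move the original data out of the target physical page;
Step 316: The first process moves the data corresponding to the second shared virtual memory page to the target physical page;
Step 317: The first process records in the page directory table the correspondence between the second shared virtual memory page and the physical page, and records the access frequency of each on-chip memory partition's processor core set for the second shared virtual memory page.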
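Steps 314 to 317 amount to a victim selection followed by bookkeeping in the two tables, sketched below. The dictionary layouts, the function name `evict_and_replace`, and ranking victims by the sum of their per-partition frequencies are assumptions; the source only says the page with the lowest access frequency is chosen.

```python
from typing import Dict, List, Sequence, Tuple


def evict_and_replace(
    directory: Dict[int, Tuple[int, List[int]]],
    history: Dict[int, List[int]],
    candidates: Sequence[int],
    new_vpage: int,
    num_partitions: int,
) -> int:
    """Evict the least-accessed candidate page and map new_vpage onto it.

    directory  : physical page -> (virtual page, per-partition frequencies)
    history    : virtual page  -> per-partition historical frequencies
    candidates : physical pages whose slot number matches the index value
    """
    # Step 314: pick the candidate whose virtual page is accessed least
    # (hypothetical ranking: sum over all partitions).
    victim = min(candidates, key=lambda p: sum(directory[p][1]))
    # Step 315: move the old mapping and its frequencies to the history table.
    old_vpage, old_freqs = directory.pop(victim)
    history[old_vpage] = list(old_freqs)
    # Steps 316-317: the page now backs new_vpage, with fresh counters.
    directory[victim] = (new_vpage, [0] * num_partitions)
    return victim
```

The actual copying of page data out of and into the physical page is omitted here; only the table updates are shown.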
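In the embodiments, virtual pages of the shared memory space are placed in on-chip memory according to how frequently the processor core set of each on-chip partition accesses them, preferring the partitions with high access frequency, which reduces access latency on the on-chip network.
Referring to FIG. 1, to make looking up a virtual memory page address in the page directory table more efficient, the embodiments also provide a way to store the page directory table: store it according to the location of the on-chip partitions. For example, the page directory table is divided into as many sub page directory tables as there are on-chip memory partitions. Each sub page directory table contains multiple entries, and one entry records the correspondence between a physical page in the on-chip memory partition holding that sub table and the corresponding shared virtual memory page of the shared virtual memory space, along with the frequency at which that shared virtual memory page is accessed by the processor core set in each on-chip memory partition. In FIG. 1 the black parts are sub page directory tables and the white horizontal boxes are the stored page directory history table.
On this basis, the first process's lookup in step 306 of whether the page directory table has a physical page corresponding to the second shared virtual memory page may include:
The first process looks up the second shared virtual memory page in the sub page directory table of the on-chip partition of the processor core running the first process. If it is not found there, the first process searches the sub page directory tables of the remaining on-chip memory partitions in the many-core system from near to far, until it finds a shared virtual memory page identical to the second shared virtual memory page or has queried the sub page directory tables of all on-chip memory partitions in the many-core system.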
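A near-to-far search over sub page directory tables might look like the sketch below. The source does not define how distance between partitions is measured; placing partitions on a 2D grid and using Manhattan distance, as well as the names `near_to_far` and `lookup`, are assumptions for this example.

```python
from typing import Dict, List, Optional, Tuple


def near_to_far(home: int, coords: Dict[int, Tuple[int, int]]) -> List[int]:
    """Order partitions by Manhattan distance from the home partition."""
    hx, hy = coords[home]
    return sorted(coords, key=lambda p: abs(coords[p][0] - hx) + abs(coords[p][1] - hy))


def lookup(
    vpage: int,
    home: int,
    coords: Dict[int, Tuple[int, int]],
    sub_tables: Dict[int, Dict[int, int]],
) -> Optional[Tuple[int, int]]:
    """Search the home partition's sub table first, then the others by distance.

    sub_tables[p] maps virtual page -> physical page for partition p.
    Returns (partition, physical_page), or None after all tables are queried.
    """
    for part in near_to_far(home, coords):
        phys = sub_tables[part].get(vpage)
        if phys is not None:
            return part, phys
    return None
```

The home partition has distance zero, so it is always examined first, matching the order described above.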
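If pages are placed in on-chip memory by the method provided in the embodiments, that is, by using the index value of the shared virtual page address to determine the slot, then when the page directory table is searched, the slot number of the physical page in each entry serves as the index value of that entry. The first process's lookup, in a partition's page directory table, of a shared virtual memory page identical to the second shared virtual memory page may include:
The first process computes an index value from the second shared virtual memory page; using the entries' index values, it determines the entry in the searched page directory table that matches the computed index value; and it checks whether the shared virtual memory page in the matching entry is the same as the second shared virtual memory page.
In a specific implementation, because each physical page in an on-chip memory partition has a slot number, an entry in a sub page directory table can correspond to the slot number of the physical page in that entry, so the slot number serves as the index value of the page directory table entry.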
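Further, the page directory history table can likewise be divided into as many sub page directory history tables as there are on-chip memory partitions. Referring again to FIG. 4, each sub page directory history table can be placed in an on-chip memory partition (the white horizontal boxes in FIG. 4), and the sub page directory tables are also placed in on-chip memory (the black parts in FIG. 4), which reduces lookup time. Of course, the page directory table and the page directory history table can also be placed in off-chip memory; the embodiments do not limit this;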
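The sub page directory history table stored in each on-chip memory partition holds the shared virtual memory pages moved out of that partition's sub page directory table, together with the historical access frequencies at which each moved-out page was accessed by the processor core set of each on-chip memory partition during the time it was in that sub page directory table;
On this basis, looking up in step 307 whether the second shared virtual memory page is in the page directory history table may include:
The first process looks up whether the second shared virtual memory page is in the sub page directory history table of the on-chip memory partition of the processor core running the first process;
If it is not found there, the first process searches the sub page directory history tables of the remaining on-chip memory partitions in the many-core system from near to far, until it finds the second shared virtual memory page or has queried the sub page directory history tables of all on-chip memory partitions in the many-core system.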
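In a specific implementation, the storage space of the page directory history table is limited, and when space runs short some of its entries must be discarded. The embodiments provide one way of discarding: the page directory history table also records, for each shared virtual memory page, the sum of the frequencies at which it was accessed by the processor core set of each on-chip memory partition during the time it was in the page directory table. When the remaining storage space of the page directory history table falls below a second preset threshold, the first process orders the shared virtual memory pages in the table by that frequency sum from low to high and discards the information of a preset number of pages with the lowest sums. The shared virtual memory page information includes the shared virtual memory page, the frequencies at which it was accessed by the processor core set in each on-chip memory partition during the time it was in on-chip memory, and the sum of those frequencies.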
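The discard policy can be sketched as follows. Measuring remaining space as a count of free entries, recomputing the sum on the fly rather than storing it as a separate field, and the name `discard_low_sum_entries` are simplifying assumptions for this example.

```python
from typing import Dict, List


def discard_low_sum_entries(
    history: Dict[int, List[int]],
    capacity: int,
    free_threshold: int,
    discard_count: int,
) -> List[int]:
    """Drop the entries with the lowest historical frequency sums.

    history        : virtual page -> per-partition historical frequencies
    capacity       : maximum number of entries the history table can hold
    free_threshold : second preset threshold, in free entries (assumed unit)
    discard_count  : preset number of entries to discard
    Returns the virtual pages whose entries were discarded.
    """
    remaining = capacity - len(history)
    if remaining >= free_threshold:
        return []  # enough room left, nothing to do
    # Order pages by frequency sum, lowest first, and take the preset number.
    victims = sorted(history, key=lambda vp: sum(history[vp]))[:discard_count]
    for vpage in victims:
        del history[vpage]
    return victims
```

Storing the sum alongside each entry, as the description suggests, would avoid recomputing it during the sort at the cost of one extra field per entry.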
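Of course, within the spirit of the discarding approach provided by the embodiments, the user may also configure it according to the actual situation.
By partitioning the on-chip memory of the many-core system and placing page data according to how frequently processor cores access on-chip memory pages, the embodiments minimize the effect of the on-chip network on on-chip access latency.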
参见图 5 , 本发明实施例提供一种数据迁移装置 50 , 设置于众核系统中, 所述众核系统包括具有多个处理器核的处理器, 并配置有分布式片上内存, 所 述分布式片上内存被划分为多个片上内存分区,根据就近原则,将多个处理器 核在所述多个片上内存分区中进行分配 ,所述众核系统中运行有多个属于同一 应用的进程, 所述进程间所述片上内存中拥有一段共享虚拟内存空间, 所述数 据迁移装置集成于所述处理器中, 所述数据迁移装置中运行有第一进程, 所述 第一进程为所述同属于一个应用程序中的多个进程中任一进程;所述数据迁移 装置包括:
访问频率获取单元 501 , 用于获取第一共享虚拟内存页面被每个片上内存 分区中处理器核集合的访问频率;所述第一共享虚拟内存页面为所述共享虚拟 内存空间中任一共享虚拟内存页面;所述处理器核集合的访问频率为归属于一 个片上内存分区中所有处理器核访问次数的总和;
迁移判断单元 502 , 用于判断所述第一共享虚拟内存页面被第二片上内存 分区中处理器核集合的访问频率比被所述第一共享虚拟内存页面对应的物理 页面所在的第一片上内存分区中的处理器核集合的访问频率是否高出第一预 设阈值;
数据迁移单元 503 , 用于当所述迁移判断单元 502的判断结果为是时,将所 述第一共享虚拟内存页面对应的物理页面中的数据移出至所述第二片上内存 分区中。
本发明实施例中通过将众核系统中的片上内存进行分区,所提供的数据迁 移装置根据将同属于一个应用程序的多个进程间共享的虚拟内存空间中的虚 拟内存页面被每个片上分区中的处理器核集合的访问频率,将虚拟内存页面对 应的数据移出至访问频率高的处理器核集合所在的片上内存分区,在后续对该 虚拟内存页面访问时, 减少了因跨分区访问而造成的时延。
进一步,所述的众核系统中将片上内存分区中针对同属于一个应用程序的 多个进程设置对应的页目录表,所述的页目录表中记录有所述片上内存中物理 页面与属于所述共享虚拟内存空间中的共享虚拟内存页面之间的对应关系,及 该共享虚拟内存页面被每个片上内存分区中的处理器核集合的访问频率; 所述访问频率 501获取单元, 具体用于通过查询所述页目录表获取所述第 一共享虚拟内存页面被每个片上内存分区中处理器核集合的访问频率;
所述数据迁移装置还包括:
页目录更新单元 504 , 用于在将所述第一共享虚拟内存页面对应的物理页 面中的数据移出至所述第二片上内存分区中之后 ,将所述第一共享虚拟内存页 面对应的物理页面更新为所述第二片上内存分区中用于存储所述移出数据的 物理页面。
通过设置页目录表来维护片上内存中物理页面和共享虚拟页面的对应关 系, 在片上内存中进行数据迁移后, 将所述页目录表进行更新。 于存储从所述页目录表中移出的共享虚拟内存页面,以及该共享虚拟内存页面 在所述页目录表中的时间段内分别被每个片上内存分区中的处理器核集合的 历史访问频率, 所述数据迁移装置 50, 还包括:
访问单元 505 , 用于访问所述共享虚拟内存空间中的第二共享虚拟内存页 面;
查找单元 506 , 用于当众核系统中出现缺页错误时, 在所述页目录表中查 找是否有与所述第二共享虚拟内存页面对应的物理页面;当在所述页目录表中 没有查找到与所述第二共享虚拟内存页面对应的物理页面,则在所述页目录历 史表中查找是否有所述第二共享虚拟内存页面;
因此, 所述页目录更新单元 504 , 还用于当在所述页目录历史表中查找到 所述第二共享虚拟内存页面,将所述第二共享虚拟内存页面从所述页目录历史 表中移出至所述页目录表中。 进一步,在将页目录历史表中的信息移出至所述页目录表中,之后或者之 前将第二虚拟共享虚拟页面对应的数据迁移到片上内存中,需要选择用于存储 所述对应数据的片上内存物理页面; 因此, 所述数据迁移装置 50还包括: 历史访问频率获取单元 507 , 用于在所述查找单元在所述页目录历史表中 查找到所述第二共享虚拟内存页面之后 ,从所述页目录历史表中获取所述第二 理器核集合的历史访问频率;
共享虚拟内存页面的历史访问频率从高到低的顺序,依次在所述片上内存分区 中确定是否有满足预设规则的目标物理页面, 直到获得目标物理页面; 所述数据迁移单元 ,还用于将所述第二共享虚拟内存页面对应的数据移出 至所述目标物理页面中;
所述页目录更新单元,还用于在所述页目录表中添加与所述第二共享虚拟 内存页面对应的目标物理页面。
对于页面选择和数据迁移, 本发明实施例还提供另一种方式,在所述同属 于应用程序的多个进程中包括维护有所述共享虚拟内存空间的虚拟页面和物 理页面之间对应关系的第二进程; 在进行页面选择和数据迁移, 可以请求该进 行来进行, 因此, 所述装置还包括:
指令单元 509 ,用于在所述查找单元 506在在所述页目录历史表中查找到所 述第二共享虚拟内存页面之后, 向所述第二进程发送请求, 所述请求用于请求 所述第二进程根据所述第二共享虚拟内存页面在片上内存时间段内被每个片 上内存分区的所述处理核集合访问的历史访问频率从高到低的顺序 ,依次在所 述片上内存分区中确定是否有满足预设规则的目标物理页面,直到获得目标物 理页面,将所述第二共享虚拟内存对应的数据移出至所述目标物理页面中; 在 所述页目录表中添加与所述第二共享虚拟内存页面对应的所述目标物理页面。
本发明实施例中, 如果所述查找单元 506在所述页目录历史表中没有查找 到所述第二共享虚拟内存页面, 则所述数据迁移装置,
所述页面选择单元 508 ,用于当所述查找单元 506在所述页目录历史表中没 有查找到所述第二共享虚拟内存页面,则判断所述数据迁移装置所在的片上内 存分区是否有满足预设规则的目标物理页面;
如果有, 通知所述数据迁移单元 503将所述第二共享虚拟内存页面对应的 数据移出到所述目标物理页面中;
如果否,则由近及远的判断所述数据迁移装置所在的片上内存分区临近的 片上内存分区中是否有满足预设规则的目标物理页面,直到查找到满足所述预 设规则的目标物理页面后, 通知所述数据迁移单元 503将所述第二共享虚拟内 存页面对应的数据移出到所述目标物理页面中;
所述页目录更新单元 504 , 还用于将所述第二共享虚拟内存页面与所述目 标物理页面的对应关系放入所述页目录表中,在所述页目录表中记录所述第二 共享虚拟内存页面被每个片上内存分区中的处理器核集合的访问频率。
本发明实施例中,对查找的具体实现给出了一种实现方式, 所述片上内存 分区中的每个物理页面具有在分区内的槽位号;所述满足预设规则的物理页面 包括: 满足预设规则的物理页面包括: 所述数据迁移装置根据所述第二共享虚拟页面地址获得索引值; 所述满足预设规则的物理页面为具有与所述索引值匹配的槽位号的,且空 闲的物理页面。
在此基础上, 所述页面选择单元 508还用于, 当所述众核系统中所有片上 内存分区中与具有与所述索引值匹配槽位号的物理页面都不是空闲物理页面, 在所述与具有与所述索引值匹配的槽位号的物理页面中,选取一个其对应的虚 拟共享内存页面的访问频率最低的物理页面作为目标物理页面;
所述数据迁移单元 503 , 还用于在将所述第二共享虚拟内存页面对应的数 据移出至所述目标物理页面中之前,将所述目标物理页面的物理地址在所述页 目录表中原本对应的共享虚拟内存页面以及分别被所述片上内存分区中的处 理器核集合访问的频率移出至所述页目录历史表中 ,并将所述目标物理页面中 原有数据移出。
本发明实施例所提供的页目录表可以是由个片上内存分区中存储的子页 目录表组成, 所述子页目录表中包括多个表项,一个表项记录所述子页目录表 所在的片上内存分区中一个物理页面与该物理页面对应的所述共享虚拟内存 空间的共享虚拟内存页面之间的对应关系以及该共享虚拟内存页面分别被所 述每个片上内存分区中的处理器核集合的访问频率; 因此:
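In the embodiments, the lookup by the searching unit 506 in the page directory table of whether there is a physical page corresponding to the second shared virtual memory page may include: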
在所述数据迁移装置所在片上分区中的子页目录表中查找所述第二共享 虚拟内存页面,当在所述数据迁移装置所在的片上分区中的子页目录中没有查 找到所述第二共享虚拟内存页面, 则按照由近及远的原则,在所述众核系统中 的其余片上内存分区中的子页目录表中查找所述第二共享虚拟内存页面,直到 查找到所述第二共享虚拟内存页面或则已经查询了所述众核系统中的所有片 上内存分区上的子页目录分区。
进一步, 本发明实施例中对所述页目录表给了一种具体实现方式, 例如: 所述页目录表包括多个表项,一个表项记录所述片上内存中一个物理页面与该 物理页面对应的所述共享虚拟内存空间的共享虚拟内存页面之间的对应关系 以及该共享虚拟内存页面分别被所述每个片上内存分区中的处理器核集合的 访问频率; 所述页目录表中每个表项中物理页面的槽位号作为表项的索引值; 因此,在所述查找单元具体查找查找所述第二共享虚拟内存页面, 可以包 括:
根据所述第二共享虚拟内存页面地址计算获得一个索引值;
根据所述表项的索引值,在所述页目录表中确定具有和所述获得的索引值 匹配的表项;判断所述匹配的表项中的共享虚拟内存页面与所述第二共享虚拟 内存页面是否相同。
进一步, 本发明实施例还给出了对页目录历史表存储空间不够的情况下, 如何处理的一种实现方式:根据所述页目录历史表中的共享虚拟内存页面的历 史访问频率总和来决定对某些表项进行丟弃, 因此:
所述数据迁移装置, 还可以包括:
表项丟弃单元 510 , 用于在所述页目录历史表剩余存储空间小于第二预设 阈值时,将所述页目录历史表中的每个共享虚拟内存页面对应的所述频率总和 从低到高的顺序, 丟弃所述频率总和最低的预设数量的共享虚拟内存页面信 息, 所述共享虚拟内存页面信息包括共享虚拟内存页面, 以及该共享虚拟内存 页面被每个所述片上内存分区中的处理器核集合的历史访问频率和所述历史 访问频率的总和。 本实施例的装置可以用于执行前述方法实施例的方法,其实现原理和技术 效果类似, 此处不再贅述。
参见图 6 , 本发明实施例还提供一种处理器 60 , 应用于众核系统中, 所述 众核系统中包括所述处理器, 所述处理器包括多个处理器核, 并配置有分布式 片上内存, 所述处理器中运行有多个属于同一应用程序的进程, 所述进程间在 片上内存中拥有一段共享虚拟内存空间 ,所述分布式片上内存被划分为多个片 上内存分区,根据就近原则, 将多个处理器核在所述多个片上内存分区中进行 分配, 所述处理器包括: 处理器核 601 ,存储器 602 , 通信接口 603 , 总线 604 ; 所述处理器核 601中运行有第一进程,所述第一进程为所述同属于一个应用程 序中的多个进程中任一进程;
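The processor core 601, the communication interface 603, and the memory 602 communicate with one another through the bus 604. The communication interface 603 is configured to receive and send data; the memory 602 is configured to store a program; the processor core 601 is configured to execute the program in the memory to perform any of the methods in the foregoing embodiments of the present invention.
A person of ordinary skill in the art can understand that all or some of the steps of the foregoing method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the foregoing method embodiments; the storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the foregoing embodiments are intended only to describe the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in those embodiments may still be modified, or some of their technical features equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.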

Claims

权 利 要 求
1、 一种数据迁移方法, 其特征在于, 应用于众核系统中, 所述众核系统 中包括具有多个处理器核的处理器, 并配置有分布式片上内存, 所述分布式片 上内存被划分为多个片上内存分区,根据就近原则,将多个处理器核在所述多 个片上内存分区中进行分配,所述众核系统中运行有多个属于同一应用程序的 进程, 所述进程间在所述片上内存中拥有一段共享虚拟内存空间, 所述方法包 括: 第一进程获取第一共享虚拟内存页面被每个片上内存分区中处理器核集 合的访问频率;所述第一进程为所述同属于一个应用程序中的多个进程中任一 进程;所述第一共享虚拟内存页面为所述第一共享虚拟内存空间中任一共享虚 拟内存页面;所述处理器核集合的访问频率为归属于一个片上内存分区中所有 处理器核访问次数的总和; 所述第一进程判断所述第一共享虚拟内存页面被第二片上内存分区中处 理器核集合的访问频率比被所述第一共享虚拟内存页面对应的物理页面所在 的第一片上内存分区中的处理器核集合的访问频率高出第一预设阈值; 则所述第一进程将所述第一共享虚拟内存页面对应的所述片上内存的物 理页面中的数据移出至所述第二片上内存分区中。
2、 根据权利要求 1所述的方法, 其特征在于, 所述的众核系统中将片上内 存分区中针对同属于一个应用程序的多个进程设置对应的页目录表,所述的页 目录表中记录有所述片上内存中物理页面与属于所述共享虚拟内存空间中的 共享虚拟内存页面之间的对应关系,及该共享虚拟内存页面被每个片上内存分 区中的处理器核集合的访问频率; 所述第一进程获取第一共享虚拟内存页面被每个片上内存分区中处理器 核集合的访问频率, 包括: 所述第一进程通过查询所述页目录表获取所述第一共享虚拟内存页面被 每个片上内存分区中处理器核集合的访问频率; 在所述第一进程将所述第一共享虚拟内存页面对应的物理页面中的数据 移出至所述第二片上内存分区中之后, 还包括:
所述第一进程将所述第一共享虚拟内存页面对应的物理页面更新为所述
3、 根据权利要求 2所述的方法, 其特征在于, 所述众核系统中还存储有对 应所述页目录表的页目录历史表,用于存储从所述页目录表中移出的共享虚拟 片上内存分区中的处理器核集合的历史访问频率, 所述方法还包括: 所述第一进程访问所述共享虚拟内存空间中的第二共享虚拟内存页面; 当众核系统中出现缺页错误时,所述第一进程在所述页目录表中查找是否 有与所述第二共享虚拟内存页面对应的物理页面; 当在所述页目录表中没有查找到与所述第二共享虚拟内存页面对应的物 理页面, 则在所述页目录历史表中查找是否有所述第二共享虚拟内存页面; 当在所述页目录历史表中查找到所述第二共享虚拟内存页面,将所述第二 共享虚拟内存页面从所述页目录历史表中移出至所述页目录表中。
4、 根据权利要求 3所述的方法, 其特征在于, 当在所述页目录历史表中查 找到所述第二共享虚拟内存页面之后, 还包括: 所述第一进程从所述页目录历史表中获取所述第二共享虚拟内存页面在 片上内存时间段内分别被每个所述片上内存分区中的处理器核集合的历史访 问频率;
区的所述处理核集合访问的历史访问频率从高到低的顺序,依次在所述片上内 存分区中确定是否有满足预设规则的目标物理页面, 直到获得目标物理页面, 将所述第二共享 在所述页目录表中添加与所述第二共享虚拟内存页面对应的目标物理页
5、 根据权利要求 3所述的方法, 其特征在于, 所述同属于应用程序的多个 进程中包括维护有所述共享虚拟内存空间的虚拟页面和物理页面之间对应关 系的第二进程;当在所述页目录历史表中查找到所述第二共享虚拟内存页面之 后, 还包括: 所述第一进程向所述第二进程发送请求,所述请求用于请求所述第二进程 所述处理核集合访问的历史访问频率从高到低的顺序,依次在所述片上内存分 区中确定是否有满足预设规则的目标物理页面, 直到获得目标物理页面,将所 述第二共享虚拟内存对应的数据移出至所述目标物理页面中;在所述页目录表 中添加与所述第二共享虚拟内存页面对应的所述目标物理页面。
6、 根据权利要求 3所述的方法, 其特征在于:
当在所述页目录历史表中没有查找到所述第二共享虚拟内存页面, 则判断运 行所述第一进程的处理器核所在的片上内存分区是否有满足预设规则的目标物 理页面;
如果有, 所述第一进程将所述第二共享虚拟内存页面对应的数据移出到所述 目标物理页面中;
如果否 , 则由近及远的判断所述运行所述第一进程的处理器核所在的片上内 存分区临近的片上内存分区中是否有满足预设规则的目标物理页面, 直到查找到 满足所述预设规则的目标物理页面; 将所述第二共享虚拟内存页面对应的数据移 出至所述目标物理页面中; 所述第一进程将所述第二共享虚拟内存页面与所述目标物理页面的对应 关系放入所述页目录表中。 , 在所述页目录表中记录所述第二共享虚拟内存页 面被每个片上内存分区中的处理器核集合的访问频率。
7、 根据权利要求 4-6任一所述的方法, 其特征在于, 所述片上内存分区中 的每个物理页面具有在分区内的槽位号;所述方法中满足预设规则的物理页面 包括: 所述第一进程根据所述第二共享虚拟页面地址获得索引值; 所述满足预设规则的物理页面为具有与所述索引值匹配的槽位号的,且空 闲的物理页面。
8、 根据权利要求 7所述的方法, 其特征在于, 当所述众核系统中所有片上 内存分区中与具有与所述索引值匹配槽位号的物理页面都不是空闲物理页面, 在所述与具有与所述索引值匹配的槽位号的物理页面中,选取一个其对应的虚 拟共享内存页面的访问频率最低的物理页面作为目标物理页面; 所述将所述第二共享虚拟内存页面对应的数据移出至所述目标物理页面 中之前, 还包括: 将所述目标物理页面的物理地址在所述页目录表中原本对应的共享虚拟 内存页面以及分别被所述片上内存分区中的处理器核集合访问的频率移出至 所述页目录历史表中, 并将所述目标物理页面中原有数据移出。
9、 根据权利要求 3所述的方法, 其特征在于, 所述页目录表由每个片上内 存分区中存储的子页目录表组成, 所述子页目录表中包括多个表项,一个表项 记录所述子页目录表所在的片上内存分区中一个物理页面与该物理页面对应 的所述共享虚拟内存空间的共享虚拟内存页面之间的对应关系以及该共享虚
所述第一进程在所述页目录表中查找是否有与所述第二共享虚拟内存页 面对应的物理页面, 包括: 所述第一进程在运行所述第一进程的处理器核所在片上分区中的子页目 录表中查找所述第二共享虚拟内存页面,当在所述第一进程所在的片上分区中 的子页目录表中没有查找到所述第二共享虚拟内存页面,则按照由近及远的原 则,在所述众核系统中的其余片上内存分区中的子页目录表中查找所述第二共 享虚拟内存页面,直到查找到所述第二共享虚拟内存页面或则已经查询了所述 众核系统中的所有片上内存分区上的子页目录分区。
10、 根据权利要求 7所述的方法, 其特征在于, 所述页目录表包括多个表 项 ,一个表项记录所述片上内存中一个物理页面与该物理页面对应的所述共享 虚拟内存空间的共享虚拟内存页面之间的对应关系以及该共享虚拟内存页面 分别被所述每个片上内存分区中的处理器核集合的访问频率;所述页目录表中 每个表项中物理页面的槽位号作为表项的索引值;所述第一进程在所述页目录 表中查找所述第二共享虚拟内存页面, 包括:
所述第一进程根据所述第二共享虚拟内存页面地址计算获得一个索引值; 根据所述表项的索引值,在所述页目录表中确定具有和所述获得的索引值 匹配的表项;判断所述匹配的表项中的共享虚拟内存页面与所述第二共享虚拟 内存页面是否相同。
11、 根据权利要求 9或 1 0所述方法, 其特征在于, 所述页目录历史表由存 上内存分区中存储的子页目录历史表中存储有从所述片上内存分区的子页目 录表中移出的共享虚拟内存页面,以及所述被移出的共享虚拟内存页面在所述 子页目录表中的时间段内分别被每个片上内存分区中的处理器核集合访问的 历史访问频率;
当在所述页目录表中没有查找到与所述第二共享虚拟内存页面对应的物 理页面, 则在所述页目录历史表中查找是否有所述第二共享虚拟内存页面, 包 括:
当在所述页目录表中没有查找到与所述第二共享虚拟内存页面对应的物 理页面, 所述第一进程在运行所述第一进程的处理器核所在的片上内存分区中 的子页目录历史表中查找是否有所述第二共享虚拟内存页面;
当在所述第一进程在运行所述第一进程的处理器核所在的片上内存分区 中的子历史页目录中没有查找到所述第二共享虚拟内存页面, 则按照由近及远 的原则,在所述众核系统中的其余片上内存分区中的子页目录历史表中查找所 述第二共享虚拟内存页面, 直到查找到所述第二共享虚拟内存页面或者已经查
12、 根据权利要求 3-6任一或 9所述方法, 其特征在于, 所述页目录历史表 中还记录有共享虚拟内存页面在所述页目录表中的时间段内被每个所述片上 内存分区的处理器核集合访问的频率的总和, 所述方法还包括:
在所述页目录历史表剩余存储空间小于第二预设阈值时,所述第一进程将 所述页目录历史表中的每个共享虚拟内存页面对应的所述频率总和从低到高 的顺序,丟弃所述频率总和最低的预设数量的共享虚拟内存页面信息, 所述共 享虚拟内存页面信息包括共享虚拟内存页面,以及该共享虚拟内存页面被每个 所述片上内存分区中的处理器核集合的历史访问频率和所述历史访问频率的 总和。
1 3、 一种数据迁移装置, 其特征在于, 设置于众核系统中, 所述众核系统 包括具有多个处理器核的处理器, 并配置有分布式片上内存, 所述分布式片上 内存被划分为多个片上内存分区,根据就近原则,将多个处理器核在所述多个 片上内存分区中进行分配, 所述众核系统中运行有多个属于同一应用的进程, 所述进程间在所述片上内存中拥有一段共享虚拟内存空间,所述数据迁移装置 集成于所述处理器中, 所述数据迁移装置中运行有第一进程, 所述第一进程为 所述同属于一个应用程序中的多个进程中任一进程; 所述数据迁移装置包括: 访问频率获取单元,用于获取第一共享虚拟内存页面被每个片上内存分区 中处理器核集合的访问频率;所述第一共享虚拟内存页面为所述共享虚拟内存 空间中任一共享虚拟内存页面;所述处理器核集合的访问频率为归属于一个片 上内存分区中所有处理器核访问次数的总和;
迁移判断单元,用于判断所述第一共享虚拟内存页面被第二片上内存分区 中处理器核集合的访问频率比被所述第一共享虚拟内存页面对应的物理页面 所在的第一片上内存分区中的处理器核集合的访问频率是否高出第一预设阈 值;
数据迁移单元, 用于当所述迁移判断单元的判断结果为是时,将所述第一 共享虚拟内存页面对应的物理页面中的数据移出至所述第二片上内存分区中。
14、 根据权利要求 1 3所述的装置, 其特征在于, 所述的众核系统中将片上 内存分区中针对同属于一个应用程序的多个进程设置对应的页目录表,所述的 页目录表中记录有所述片上内存中物理页面与属于所述共享虚拟内存空间中 的共享虚拟内存页面之间的对应关系,及该共享虚拟内存页面被每个片上内存 分区中的处理器核集合的访问频率;
所述访问频率获取单元,具体用于通过查询所述页目录表获取所述第一共 享虚拟内存页面被每个片上内存分区中处理器核集合的访问频率;
所述装置还包括: 页目录更新单元, 用于在将所述第一共享虚拟内存页面 对应的物理页面中的数据移出至所述第二片上内存分区中之后,将所述第一共 享虚拟内存页面对应的物理页面更新为所述第二片上内存分区中用于存储所 述移出数据的物理页面。
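15. The apparatus according to claim 14, wherein the many-core system further stores a page directory history table corresponding to the page directory table, used to store the shared virtual memory pages moved out of the page directory table and the historical access frequency of each such page by the processor core set of each on-chip memory partition; the data migration apparatus further comprises:
an access unit, configured to access a second shared virtual memory page in the shared virtual memory space;
a lookup unit, configured to, when a page fault occurs in the many-core system, search the page directory table for a physical page corresponding to the second shared virtual memory page, and, when no physical page corresponding to the second shared virtual memory page is found in the page directory table, search the page directory history table for the second shared virtual memory page;
the page directory updating unit is further configured to, when the second shared virtual memory page is found in the page directory history table, move the second shared virtual memory page from the page directory history table to the page directory table.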
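16. The apparatus according to claim 15, wherein the data migration apparatus further comprises:
a historical access frequency obtaining unit, configured to, after the lookup unit finds the second shared virtual memory page in the page directory history table, obtain from the page directory history table the historical access frequency of the second shared virtual memory page by the processor core set of each on-chip memory partition;
a page selection unit, configured to check the on-chip memory partitions one by one, in descending order of the historical access frequencies of the second shared virtual memory page obtained by the historical access frequency obtaining unit, for a target physical page that satisfies a preset rule, until a target physical page is obtained;
the data migration unit is further configured to move the data corresponding to the second shared virtual memory page to the target physical page;
the page directory updating unit is further configured to add, to the page directory table, the target physical page corresponding to the second shared virtual memory page.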
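17. The apparatus according to claim 15, wherein the plurality of processes belonging to the same application include a second process that maintains the correspondence between virtual pages and physical pages of the shared virtual memory space; the apparatus further comprises:
an instruction unit, configured to, after the lookup unit finds the second shared virtual memory page in the page directory history table, send a request to the second process, the request asking the second process to: check the on-chip memory partitions one by one, in descending order of the historical access frequencies at which the second shared virtual memory page was accessed by the processor core sets of the on-chip memory partitions, for a target physical page that satisfies a preset rule, until a target physical page is obtained; move the data corresponding to the second shared virtual memory page to the target physical page; and add, to the page directory table, the target physical page corresponding to the second shared virtual memory page.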
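18. The apparatus according to claim 15, wherein the apparatus further comprises:
a page selection unit, configured to, when the lookup unit does not find the second shared virtual memory page in the page directory history table, judge whether the on-chip memory partition where the data migration apparatus is located has a target physical page that satisfies a preset rule;
if so, notify the data migration unit to move the data corresponding to the second shared virtual memory page to the target physical page;
if not, judge, from near to far, whether the on-chip memory partitions neighbouring the on-chip memory partition where the data migration apparatus is located have a target physical page that satisfies the preset rule, until a target physical page satisfying the preset rule is found, and then notify the data migration unit to move the data corresponding to the second shared virtual memory page to the target physical page;
the page directory updating unit is further configured to put the correspondence between the second shared virtual memory page and the target physical page into the page directory table, and to record in the page directory table the frequency at which the second shared virtual memory page is accessed by the processor core set of each on-chip memory partition.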
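19. The apparatus according to claim 16 or 18, wherein each physical page in an on-chip memory partition has a slot number within the partition; the physical page satisfying the preset rule comprises: the data migration apparatus obtains an index value from the address of the second shared virtual memory page;
the physical page satisfying the preset rule is a free physical page whose slot number matches the index value.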
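20. The apparatus according to claim 19, wherein the page selection unit is further configured to, when none of the physical pages in all on-chip memory partitions of the many-core system whose slot numbers match the index value is a free physical page, select, from the physical pages whose slot numbers match the index value, the physical page whose corresponding shared virtual memory page has the lowest access frequency as the target physical page;
the data migration unit is further configured to, before moving the data corresponding to the second shared virtual memory page to the target physical page, move the shared virtual memory page to which the physical address of the target physical page originally corresponds in the page directory table, together with the frequencies at which that page is accessed by the processor core sets of the on-chip memory partitions, to the page directory history table, and move the original data out of the target physical page.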
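The target page selection of claims 19 and 20 can be combined into one hedged Python sketch. The occupancy model, the name `choose_target` and the caller-supplied `partition_order` are assumptions. The sketch also assumes the victim is the page with the lowest access frequency, which is how the damaged superlative in the source text of claim 20 is read here.

```python
from typing import Dict, List, Optional, Tuple


def choose_target(occupancy: Dict[int, List[Optional[int]]], slot: int,
                  partition_order: List[int]) -> Tuple[int, bool]:
    """Pick the partition whose physical page at `slot` will receive the data.

    occupancy[p][slot] is None when that physical page is free; otherwise it is
    the access frequency of the shared virtual memory page currently stored there.
    partition_order is the order in which partitions are tried, for example by
    descending historical access frequency (claim 16) or near to far (claim 18).
    Returns (partition, needs_eviction).
    """
    # Claim 19: prefer a free physical page whose slot number matches the index.
    for part in partition_order:
        if occupancy[part][slot] is None:
            return part, False
    # Claim 20: no matching slot is free, so evict the least-accessed occupant.
    victim = min(partition_order, key=lambda p: occupancy[p][slot])
    return victim, True
```

When `needs_eviction` is true, the caller would first move the victim's entry and frequencies to the page directory history table and move its data out, as the second half of claim 20 requires.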
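21. The apparatus according to claim 15, wherein the page directory table is composed of sub page directory tables stored in each on-chip memory partition; a sub page directory table comprises a plurality of entries, each entry recording the correspondence between a physical page in the on-chip memory partition where the sub page directory table is located and the shared virtual memory page of the shared virtual memory space that corresponds to that physical page, together with the frequency at which that shared virtual memory page is accessed by the processor core set of each on-chip memory partition;
the lookup unit searching the page directory table for a physical page corresponding to the second shared virtual memory page comprises:
searching for the second shared virtual memory page in the sub page directory table in the on-chip memory partition where the data migration apparatus is located; when the second shared virtual memory page is not found in the sub page directory table in the on-chip memory partition where the data migration apparatus is located, searching, on a near-to-far basis, the sub page directory tables in the remaining on-chip memory partitions of the many-core system for the second shared virtual memory page, until the second shared virtual memory page is found or the sub page directory tables in all on-chip memory partitions of the many-core system have been searched.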
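22. The apparatus according to claim 19, wherein the page directory table comprises a plurality of entries; each entry records the correspondence between a physical page in the on-chip memory and the shared virtual memory page of the shared virtual memory space that corresponds to that physical page, together with the frequency at which that shared virtual memory page is accessed by the processor core set of each on-chip memory partition; the slot number of the physical page in each entry of the page directory table serves as the index value of the entry;
the lookup unit searching the page directory table for the second shared virtual memory page specifically comprises:
calculating an index value from the address of the second shared virtual memory page;
determining, according to the index values of the entries, the entry in the page directory table whose index value matches the obtained index value; and judging whether the shared virtual memory page in the matching entry is the same as the second shared virtual memory page.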
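23. The apparatus according to any one of claims 15 to 18 or claim 21, wherein the page directory history table further records the sum of the frequencies at which a shared virtual memory page was accessed by the processor core set of each on-chip memory partition during the period in which the page was in the page directory table, and the apparatus further comprises:
an entry discarding unit, configured to, when the remaining storage space of the page directory history table is less than a second preset threshold, order the shared virtual memory pages in the page directory history table by their frequency sums from low to high and discard the shared virtual memory page information of a preset number of pages with the lowest frequency sums, the shared virtual memory page information comprising the shared virtual memory page, the historical access frequency of that shared virtual memory page by the processor core set of each on-chip memory partition, and the sum of the historical access frequencies.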
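24. A processor, applied in a many-core system, wherein the many-core system comprises the processor; the processor comprises a plurality of processor cores and is configured with distributed on-chip memory; a plurality of processes belonging to the same application run in the processor, and the processes share a segment of shared virtual memory space in the on-chip memory; the distributed on-chip memory is divided into a plurality of on-chip memory partitions, and the processor cores are allocated among the on-chip memory partitions according to a proximity principle; the processor comprises: a processor core, a memory, a communication interface and a bus; a first process runs in the processor core, the first process being any one of the plurality of processes belonging to the same application;
the processor core, the communication interface and the memory communicate with one another through the bus; the communication interface is configured to receive and send data; the memory is configured to store a program;
the processor core is configured to execute the program in the memory so as to perform the method according to any one of claims 1 to 12.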
PCT/CN2013/091232 2013-12-31 2013-12-31 Data migration method, apparatus and processor WO2015100674A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201380002713.5A CN104956341A (zh) 2013-12-31 2013-12-31 Data migration method, apparatus and processor
EP13900674.6A EP3062229A4 (en) 2013-12-31 2013-12-31 Data migration method, device and processor
PCT/CN2013/091232 WO2015100674A1 (zh) 2013-12-31 2013-12-31 Data migration method, apparatus and processor
US15/197,358 US20160306741A1 (en) 2013-12-31 2016-06-29 Data Migration Method and Apparatus, and Processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2013/091232 WO2015100674A1 (zh) 2013-12-31 2013-12-31 Data migration method, apparatus and processor

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/197,358 Continuation US20160306741A1 (en) 2013-12-31 2016-06-29 Data Migration Method and Apparatus, and Processor

Publications (1)

Publication Number Publication Date
WO2015100674A1 true WO2015100674A1 (zh) 2015-07-09

Family

ID=53493010

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/091232 WO2015100674A1 (zh) 2013-12-31 2013-12-31 Data migration method, apparatus and processor

Country Status (4)

Country Link
US (1) US20160306741A1 (zh)
EP (1) EP3062229A4 (zh)
CN (1) CN104956341A (zh)
WO (1) WO2015100674A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107621927A (zh) * 2017-09-29 2018-01-23 南京宏海科技有限公司 Vertical scaling method based on a hyper-converged system and apparatus therefor
WO2023125285A1 (zh) * 2021-12-31 2023-07-06 华为技术有限公司 Database system update method and related apparatus

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10606487B2 (en) * 2017-03-17 2020-03-31 International Business Machines Corporation Partitioned memory with locally aggregated copy pools
CN109901800B (zh) * 2019-03-14 2020-05-19 重庆大学 Hybrid memory system and operating method thereof
US11620233B1 (en) * 2019-09-30 2023-04-04 Amazon Technologies, Inc. Memory data migration hardware
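CN110795213B (zh) * 2019-12-12 2022-06-07 东北大学 Predictive migration method for active memory during virtual machine migration
CN111782559A (zh) * 2020-07-06 2020-10-16 Oppo广东移动通信有限公司 Page management method, apparatus and computer-readable storage medium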
US20230033029A1 (en) * 2021-07-22 2023-02-02 Vmware, Inc. Optimized memory tiering
CN115344505B (zh) * 2022-08-01 2023-05-09 江苏华存电子科技有限公司 Memory access method based on perception classification

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101008923A (zh) * 2007-01-26 2007-08-01 浙江大学 Segmented storage space management method for heterogeneous multi-core architectures
US20080172524A1 (en) * 2007-01-11 2008-07-17 Raza Microelectronics, Inc. Systems and methods for utilizing an extended translation look-aside buffer having a hybrid memory structure
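CN102662638A (zh) * 2012-03-31 2012-09-12 北京理工大学 Threshold boundary selection method supporting helper-thread prefetch distance parameters
CN103077128A (zh) * 2012-12-29 2013-05-01 华中科技大学 Dynamic shared-cache partitioning method in a multi-core environment
CN103440225A (zh) * 2013-08-21 2013-12-11 复旦大学 Reconfigurable single-instruction multiple-process multi-core processor and method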

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3889044B2 (ja) * 1995-05-05 2007-03-07 シリコン、グラフィクス、インコーポレイテッド Page migration in a non-uniform memory access (NUMA) system
US6347362B1 (en) * 1998-12-29 2002-02-12 Intel Corporation Flexible event monitoring counters in multi-node processor systems and process of operating the same
US6766424B1 (en) * 1999-02-09 2004-07-20 Hewlett-Packard Development Company, L.P. Computer architecture with dynamic sub-page placement
CN1560745A (zh) * 2004-02-25 2005-01-05 中国人民解放军国防科学技术大学 Dynamic page migration method based on instant-state access information
US8037465B2 (en) * 2005-09-30 2011-10-11 Intel Corporation Thread-data affinity optimization using compiler
US9753831B2 (en) * 2012-05-30 2017-09-05 Red Hat Israel, Ltd. Optimization of operating system and virtual machine monitor memory management

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080172524A1 (en) * 2007-01-11 2008-07-17 Raza Microelectronics, Inc. Systems and methods for utilizing an extended translation look-aside buffer having a hybrid memory structure
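CN101008923A (zh) * 2007-01-26 2007-08-01 浙江大学 Segmented storage space management method for heterogeneous multi-core architectures
CN102662638A (zh) * 2012-03-31 2012-09-12 北京理工大学 Threshold boundary selection method supporting helper-thread prefetch distance parameters
CN103077128A (zh) * 2012-12-29 2013-05-01 华中科技大学 Dynamic shared-cache partitioning method in a multi-core environment
CN103440225A (zh) * 2013-08-21 2013-12-11 复旦大学 Reconfigurable single-instruction multiple-process multi-core processor and method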

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
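CN107621927A (zh) * 2017-09-29 2018-01-23 南京宏海科技有限公司 Vertical scaling method based on a hyper-converged system and apparatus therefor
CN107621927B (zh) * 2017-09-29 2020-08-14 南京宏海科技有限公司 Vertical scaling method based on a hyper-converged system and apparatus therefor
WO2023125285A1 (zh) * 2021-12-31 2023-07-06 华为技术有限公司 Database system update method and related apparatus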

Also Published As

Publication number Publication date
CN104956341A (zh) 2015-09-30
EP3062229A1 (en) 2016-08-31
US20160306741A1 (en) 2016-10-20
EP3062229A4 (en) 2017-01-25

Similar Documents

Publication Publication Date Title
WO2015100674A1 (zh) Data migration method, apparatus and processor
US10013317B1 (en) Restoring a volume in a storage system
US20170195282A1 (en) Address Processing Method, Related Device, and System
US7827178B2 (en) File server for performing cache prefetching in cooperation with search AP
JP2004171547A (ja) Method and apparatus for managing a memory system
WO2011107046A2 (zh) Memory access monitoring method and apparatus
US11809382B2 (en) System and method for supporting versioned objects
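JP6188607B2 (ja) Index tree search method and computer
JP5817558B2 (ja) Information processing apparatus, distributed processing system, cache management program and distributed processing method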
US20220156243A1 (en) Method, device, and computer program product for managing storage system
US20240012813A1 (en) Dynamic prefetching for database queries
Min et al. Vmmb: Virtual machine memory balancing for unmodified operating systems
US10031777B2 (en) Method and system for scheduling virtual machines in integrated virtual machine clusters
CN117033831A (zh) Client caching method, apparatus and medium
CN116594562A (zh) Data processing method and apparatus, device, and storage medium
Zhang et al. Redis++: A high performance in-memory database based on segmented memory management and two-level hash index
Shu et al. Accelerating big data applications on tiered storage system with various eviction policies
Khan et al. On smart query routing: for distributed graph querying with decoupled storage
KR102054068B1 (ko) Partitioning method and partitioning apparatus for real-time distributed storage of graph streams
Cheng et al. Improving LSM‐trie performance by parallel search
CN112445794A (zh) Caching method for a big data system
Liang et al. Correlation-aware replica prefetching strategy to decrease access latency in edge cloud
CN117539915B (zh) Data processing method and related apparatus
Gokhale et al. KVZone and the Search for a Write-Optimized Key-Value Store
Van Hung et al. An effective data placement strategy in main-memory database cluster

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13900674

Country of ref document: EP

Kind code of ref document: A1

REEP Request for entry into the european phase

Ref document number: 2013900674

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2013900674

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE