CN106133704A - Memory fault isolation method and device - Google Patents

Publication number
CN106133704A
Authority
CN
China
Prior art keywords
physical address
internal memory
memory
address block
belonging
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201580011928.2A
Other languages
Chinese (zh)
Inventor
刘勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02: Addressing or allocation; Relocation

Abstract

A memory fault isolation method and device. While a process is running, the server monitors the state of the memory identified by the original physical address to which a virtual address of the process is mapped. If the memory identified by the original physical address fails, the server marks the state of the physical address block to which the original physical address belongs as faulty, thereby isolating the faulty memory space online. The server also reallocates a physical address block for the page to which the virtual address belongs, and synchronizes the data in the memory range identified by the original physical address block to the memory range identified by the reallocated physical address block. Throughout this procedure the virtual address of the process remains unchanged, so the service is not interrupted while the faulty memory space is isolated online.

Description

Memory fault isolation method and device
Technical field
Embodiments of the present invention relate to computer technology, and in particular to a memory fault isolation method and device.
Background
A memory fault in a server can cause the server or a board to reset. A server reset interrupts the applications that are running, and the server must be returned to the manufacturer to have the memory replaced, which is inconvenient.
The prior art provides an offline method for isolating faulty memory that does not require returning the server to the manufacturer. Before the server runs, the basic input/output system (BIOS) tests the memory and obtains the address information of the faulty memory space. This address information is saved in non-volatile memory (NVM). The address information saved in the NVM is then read, and the faulty memory space it corresponds to is marked as unavailable, which isolates the faulty memory space permanently.
The prior-art method can isolate faults only before the server runs. A memory hardware fault that occurs while the server is running still interrupts the service.
Summary of the invention
Embodiments of the present invention provide a memory fault isolation method and device that can isolate faults while the server is running and thereby avoid service interruption.
A first aspect of the present invention provides a memory fault isolation device, including:
An exception handling module, configured to monitor the state of the memory identified by the original physical address to which a virtual address of a process is mapped, where a mapping relationship exists between the page to which the virtual address belongs and the physical address block to which the original physical address belongs, and the physical address block identifies a contiguous memory range allocated to the process;
If the memory identified by the original physical address fails, the exception handling module is further configured to mark the state of the physical address block to which the original physical address belongs as faulty;
A memory management module, configured to reallocate, from non-faulty memory, a physical address block for the page to which the virtual address belongs;
The exception handling module is further configured to synchronize the data in the memory range identified by the physical address block to which the original physical address belongs to the memory range identified by the reallocated physical address block.
With reference to the first aspect, in a first possible implementation of the first aspect, the exception handling module is further configured to save information about the physical address block marked as faulty in a non-volatile memory;
The memory management module is specifically configured to:
Determine the non-faulty memory according to the information about the faulty physical address block saved in the non-volatile memory;
Reallocate, from the non-faulty memory, a physical address block for the page to which the virtual address belongs, according to the virtual address and the process ID of the process.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the memory management module is specifically configured to:
Obtain the page to which the virtual address belongs according to the virtual address and the process ID of the process;
Select a physical address block from the non-faulty memory, and establish a mapping relationship between the page to which the virtual address belongs and the selected physical address block.
With reference to the first aspect, in a third possible implementation of the first aspect, the memory management module is further configured to:
When allocating initial memory to the process, read the information about the faulty physical address block saved in the non-volatile memory;
Determine the non-faulty memory according to the information about the faulty physical address block, where the non-faulty memory is the memory other than the faulty physical address block;
Allocate the initial memory to the process from the non-faulty memory.
With reference to the first aspect or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, the exception handling module is further configured to:
When the server restarts, perform fault detection on the memory;
If it is detected that the physical address block to which the original physical address belongs has recovered, mark the state of that physical address block as non-faulty.
A second aspect of the present invention provides a memory fault isolation method, including:
Monitoring the state of the memory identified by the original physical address to which a virtual address of a process is mapped, where a mapping relationship exists between the page to which the virtual address belongs and the physical address block to which the original physical address belongs, and the physical address block identifies a contiguous memory range allocated to the process;
If the memory identified by the original physical address fails, marking the state of the physical address block to which the original physical address belongs as faulty;
Reallocating, from non-faulty memory, a physical address block for the page to which the virtual address belongs, and synchronizing the data in the memory range identified by the physical address block to which the original physical address belongs to the memory range identified by the reallocated physical address block.
With reference to the second aspect, in a first possible implementation of the second aspect, the method further includes:
Saving information about the physical address block marked as faulty in a non-volatile memory;
The reallocating, from non-faulty memory, of a physical address block for the page to which the virtual address belongs includes:
Determining the non-faulty memory according to the information about the faulty physical address block saved in the non-volatile memory;
Reallocating, from the non-faulty memory, a physical address block for the page to which the virtual address belongs, according to the virtual address and the process ID of the process.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the reallocating, from the non-faulty memory, of a physical address block for the page to which the virtual address belongs according to the virtual address and the process ID of the process includes:
Obtaining the page to which the virtual address belongs according to the virtual address and the process ID of the process;
Selecting a physical address block from the non-faulty memory, and establishing a mapping relationship between the page to which the virtual address belongs and the selected physical address block.
With reference to the second aspect, in a third possible implementation of the second aspect, the method further includes:
When allocating initial memory to the process, reading the information about the faulty physical address block saved in the non-volatile memory;
Determining the non-faulty memory according to the information about the faulty physical address block, where the non-faulty memory is the memory other than the faulty physical address block;
Allocating the initial memory to the process from the non-faulty memory.
With reference to the second aspect or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the method further includes:
When the server restarts, performing fault detection on the memory;
If it is detected that the physical address block to which the original physical address belongs has recovered, marking the state of that physical address block as non-faulty.
With the memory fault isolation method and device of the embodiments of the present invention, while a process is running, the server monitors the state of the memory identified by the original physical address to which a virtual address of the process is mapped. If the memory identified by the original physical address fails, the server marks the state of the physical address block to which the original physical address belongs as faulty, thereby isolating the faulty memory space online. The server also reallocates a physical address block for the page to which the virtual address belongs, and synchronizes the data in the memory range identified by the original physical address block to the memory range identified by the reallocated physical address block. The virtual address of the process remains unchanged throughout, so the service is not interrupted while the faulty memory space is isolated online.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a server to which the embodiments of the present invention are applicable;
Fig. 2 is a schematic structural diagram of a memory fault isolation device according to Embodiment 1 of the present invention;
Fig. 3 is a flowchart of a memory fault isolation method according to Embodiment 2 of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The method of the embodiments of the present invention is mainly applied to operating system kernels and VMM kernels that have a page table mapping mechanism and an exception handling mechanism. Fig. 1 is a schematic structural diagram of a server to which the embodiments of the present invention are applicable. As shown in Fig. 1, the server includes memory, a processor, and an NVM. The processor is mainly a central processing unit (CPU) and includes a memory management module and an exception handling module. The memory management module implements the mapping between the virtual address space of a virtual machine or process and the physical address space. The exception handling module performs the relevant processing when a CPU exception is triggered; in the embodiments of the present invention it is used to handle memory faults. The memory and the NVM are two separate pieces of physical hardware. The NVM stores the physical addresses of the faulty memory space, and the data saved in the NVM is not lost even after the server is powered off. The memory is usually random access memory (RAM) or dynamic random access memory (DRAM) and generally takes the form of memory modules. A single memory module in a server is typically 8 GB, 16 GB, or larger, so replacement is costly. A virtual machine or a process is the form in which a service runs.
Fig. 2 is a schematic structural diagram of a memory fault isolation device according to Embodiment 1 of the present invention. The device of this embodiment may be integrated in a server. As shown in Fig. 2, the memory fault isolation device of this embodiment includes an exception handling module 11 and a memory management module 12.
The exception handling module 11 is configured to monitor the state of the memory identified by the original physical address to which a virtual address of a process is mapped, where a mapping relationship exists between the page to which the virtual address belongs and the physical address block to which the original physical address belongs, and the physical address block identifies a contiguous memory range allocated to the process.
If the memory identified by the original physical address fails, the exception handling module 11 is further configured to mark the state of the physical address block to which the original physical address belongs as faulty.
The memory management module 12 is configured to reallocate, from non-faulty memory, a physical address block for the page to which the virtual address belongs.
The exception handling module 11 is further configured to synchronize the data in the memory range identified by the physical address block to which the original physical address belongs to the memory range identified by the reallocated physical address block.
When a user starts an application program, the operating system of the server creates a process for the program. The memory management module 12 is further configured to allocate initial memory space to the process, that is, to allocate pages to the process. In the page table mapping mechanism, each page covers a range of virtual addresses and corresponds to one physical address block, each physical address block identifies a contiguous memory range, and each virtual address in a page corresponds one-to-one with a physical address in the physical address block. The mapping between virtual addresses and physical addresses is maintained by the memory management module 12 of the server.
In this embodiment, the memory management module 12 allocates initial memory to the process as follows. First, it reads the information about the faulty physical address blocks saved in the NVM. Then, according to that information, it determines the non-faulty memory, which is the memory other than the faulty physical address blocks. Finally, it allocates the initial memory to the process from the non-faulty memory. The memory management module 12 implements the mapping between virtual addresses and physical addresses in the form of page tables. Page tables may be organized in one level or in multiple levels; for example, the Linux kernel uses a three-level page table scheme. The size of each page may be 4 KB, 2 MB, 1 GB, and so on. This embodiment does not limit how page tables are managed.
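The initial allocation described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Allocator` class, its field names, and the numbering of blocks by integer index are assumptions made for illustration, and the set of faulty blocks is simply passed in rather than read from a real NVM.

```python
from dataclasses import dataclass, field


@dataclass
class Allocator:
    """Hypothetical allocator that hands out only non-faulty blocks."""
    total_blocks: int
    faulty: set = field(default_factory=set)   # block numbers recorded in NVM (assumed)
    in_use: set = field(default_factory=set)   # block numbers already allocated

    def non_faulty_free(self):
        # Non-faulty memory is all memory except the faulty blocks;
        # blocks already in use are skipped as well.
        return [b for b in range(self.total_blocks)
                if b not in self.faulty and b not in self.in_use]

    def allocate(self, count):
        free = self.non_faulty_free()
        if len(free) < count:
            raise MemoryError("not enough non-faulty blocks")
        chosen = free[:count]
        self.in_use.update(chosen)
        return chosen
```

For example, `Allocator(8, faulty={1, 3}).allocate(3)` skips blocks 1 and 3 and returns the three lowest-numbered healthy blocks.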
By reading the information about the faulty physical address blocks in the NVM, the memory management module 12 avoids the memory identified by those blocks when it allocates initial memory to the process. While the process continues to run, the exception handling module 11 monitors the state of the memory identified by the original physical address to which the virtual address of the process is mapped. That state is either faulty or normal. Specifically, the memory management module 12 receives a memory access request sent by the process; the request includes a virtual address of the process. The memory management module 12 maps the virtual address to the original physical address and stores the correspondence between the two in the translation look-aside buffer (TLB). The memory management module 12 then sends the original physical address to the memory controller over the memory bus, and the memory controller reads data according to the original physical address. If an exception occurs and the data in the memory identified by the original physical address cannot be read, the memory controller sends an abnormal access instruction over the memory bus, and the exception handling module 11 determines from this instruction that the memory identified by the original physical address has failed.
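The access path above (translate the virtual address, cache the mapping, read through the memory controller, and treat a failed read as a fault) can be sketched as follows. This is a simplified software model under stated assumptions: `Mmu`, `MemoryController`, `MemoryAccessFault`, and `monitored_read` are hypothetical names, a Python exception stands in for the abnormal access instruction, and a 4 KB page size is assumed.

```python
PAGE_SIZE = 4096  # assumed page and block size


class MemoryAccessFault(Exception):
    """Stands in for the abnormal access instruction from the memory controller."""
    def __init__(self, physical_address):
        super().__init__(hex(physical_address))
        self.physical_address = physical_address


class MemoryController:
    def __init__(self, bad_addresses):
        self.bad = set(bad_addresses)   # simulated failed cells
        self.cells = {}                 # physical address -> stored value

    def read(self, physical_address):
        if physical_address in self.bad:
            raise MemoryAccessFault(physical_address)
        return self.cells.get(physical_address, 0)


class Mmu:
    def __init__(self, page_table):
        self.page_table = dict(page_table)  # virtual page number -> block number
        self.tlb = {}                       # cache of recent translations

    def translate(self, virtual_address):
        vpn, offset = divmod(virtual_address, PAGE_SIZE)
        if vpn not in self.tlb:
            self.tlb[vpn] = self.page_table[vpn]
        return self.tlb[vpn] * PAGE_SIZE + offset


def monitored_read(mmu, controller, virtual_address):
    """Return ("normal", data) or ("faulty", number of the failing block)."""
    physical_address = mmu.translate(virtual_address)
    try:
        return "normal", controller.read(physical_address)
    except MemoryAccessFault as fault:
        return "faulty", fault.physical_address // PAGE_SIZE
```

The caller learns which physical address block contains the failed address, which is the information needed to mark that block as faulty.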
In the page table mapping mechanism, a page is generally the minimum unit of operation, and one page corresponds to one physical address block. Therefore, when the memory identified by the original physical address fails, the exception handling module 11 marks the state of the physical address block to which the original physical address belongs as faulty and isolates that block. The information about faulty physical address blocks is usually recorded in the NVM, so it is not lost even if the server loses power. After the server is powered on again, the memory management module 12 can still read the information about the faulty physical address blocks from the NVM and avoid the memory ranges they identify when allocating initial memory to a process.
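A minimal sketch of recording faulty blocks so that they survive a power cycle is shown below. The patent stores this information in an NVM device; here a JSON file stands in for the NVM purely as an assumption, and `FaultStore` with its methods is a hypothetical name.

```python
import json
import os


class FaultStore:
    """Hypothetical persistent record of faulty block numbers (file stands in for NVM)."""

    def __init__(self, path):
        self.path = path

    def load(self):
        # After power-on, read back whatever was recorded earlier.
        if not os.path.exists(self.path):
            return set()
        with open(self.path) as f:
            return set(json.load(f))

    def mark_faulty(self, block):
        # Add the block to the persistent record of faulty blocks.
        blocks = self.load()
        blocks.add(block)
        with open(self.path, "w") as f:
            json.dump(sorted(blocks), f)
```

A new `FaultStore` created on the same path after a restart sees every block marked before the restart, which is the property the allocator relies on.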
In this embodiment, to ensure that the running service is not interrupted, the memory management module 12 reallocates a physical address block for the page to which the virtual address belongs. The physical address to which the virtual address is mapped changes as a result of the reallocation, but for the upper-layer application the virtual address of the corresponding process does not change. As long as the virtual address is unchanged, the process is not interrupted, so the user's service is not interrupted.
The memory management module 12 reallocates a physical address block for the page to which the virtual address belongs as follows. First, according to the information about the faulty physical address blocks saved in the NVM, it determines the non-faulty memory, which is the memory other than the memory identified by the faulty physical address blocks. Then, according to the virtual address and the process ID of the process, it reallocates a physical address block from the non-faulty memory for the page to which the virtual address belongs. This second step is carried out as follows: first, the page to which the virtual address belongs is obtained according to the virtual address and the process ID of the process; then, a physical address block is selected from the non-faulty memory, and a mapping relationship is established between the page to which the virtual address belongs and the selected physical address block.
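The reallocation steps above can be sketched in Python as follows. The function name `reallocate_block`, the dictionary-based page tables keyed by process ID, and the first-fit choice of a healthy block are assumptions for illustration; the text does not specify how the replacement block is chosen.

```python
def reallocate_block(page_tables, faulty, in_use, total_blocks,
                     pid, virtual_address, page_size=4096):
    """Remap the page containing virtual_address to a healthy, unused block.

    page_tables: {pid: {virtual page number: block number}} (assumed layout)
    faulty, in_use: sets of block numbers
    Returns (old_block, new_block).
    """
    # Step 1: find the page from the virtual address and the process ID.
    vpn = virtual_address // page_size
    table = page_tables[pid]
    old_block = table[vpn]

    # Step 2: pick a block from non-faulty memory and establish the mapping.
    for block in range(total_blocks):
        if block not in faulty and block not in in_use:
            table[vpn] = block
            in_use.add(block)
            in_use.discard(old_block)   # the faulty block is no longer allocated
            return old_block, block
    raise MemoryError("no non-faulty block available")
```

Only the page table entry changes; the virtual address used by the process stays the same, which is why the process can continue running.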
After the memory management module 12 has reallocated a physical address block for the page to which the virtual address belongs, the exception handling module 11 synchronizes the data in the memory range identified by the physical address block to which the original physical address belongs to the memory range identified by the reallocated physical address block. Specifically, when the memory identified by the original physical address fails, the memory controller generates an abnormal access instruction, and the exception handling module 11 performs data recovery according to this instruction. The abnormal access instruction contains an opcode and an operand. The opcode indicates whether the operation type of the abnormal access is a read or a write. The operand includes information about the register to be accessed by the read or write operation and the physical address of the data to be accessed. If the operation type is a write, the exception handling module 11 writes the data that was to be written to the memory identified by the original physical address to the corresponding position in the reallocated physical address block. If the operation type is a read, the exception handling module 11 may recover the data to be read from a backup of that data. If the data can be recovered from the backup, the recovered data is copied to the corresponding position in the reallocated physical address block. If the data to be read cannot be recovered, the exception handling module 11 resets the process. Unlike the prior art, this embodiment needs to reset only the currently monitored process and does not interrupt other processes running on the server. In the prior art, once a memory fault occurs while a process is running, the server must be reset, which interrupts all processes running on the server and therefore all services.
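The read and write cases of the recovery logic can be sketched as below. The `AbnormalAccess` record, the string return values, and the use of dictionaries for the reallocated block and for the backup are illustrative assumptions; the text does not say where the backup data comes from.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AbnormalAccess:
    """Hypothetical decoded form of the abnormal access instruction."""
    op: str                      # "read" or "write" (from the opcode)
    offset: int                  # position of the data within the block
    data: Optional[int] = None   # value pending for a write


def recover(access, new_block, backup):
    """Apply the recovery rule; new_block and backup map offset -> value.

    Returns "recovered" on success or "reset_process" if a read cannot be recovered.
    """
    if access.op == "write":
        # The pending data goes straight to the same position in the new block.
        new_block[access.offset] = access.data
        return "recovered"
    if access.op == "read":
        if access.offset in backup:
            # Restore the lost data from the backup into the new block.
            new_block[access.offset] = backup[access.offset]
            return "recovered"
        # No backup: only this process is reset, not the whole server.
        return "reset_process"
    raise ValueError(f"unknown operation type: {access.op}")
```

The "reset_process" outcome affects only the process that touched the failed address, matching the contrast with the prior art drawn above.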
It should be noted that, although the state of the original physical address block is marked as faulty in this embodiment, only the memory identified by the original physical address has actually failed; the rest of the memory in the range identified by the original physical address block is normal. Therefore, when the exception handling module 11 synchronizes the data in the memory range identified by the original physical address block to the memory range identified by the reallocated physical address block, the data in the normal memory can be copied directly from the former range to the latter.
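A minimal sketch of this direct copy is given below, assuming blocks are modelled as `bytearray` objects and the failed positions are known as a set of offsets; `sync_block` is a hypothetical helper name.

```python
def sync_block(old_block, new_block, failed_offsets):
    """Copy every healthy byte from old_block to the same offset in new_block.

    Offsets listed in failed_offsets are skipped; those bytes are expected to be
    filled in separately by the read/write recovery path.
    Returns the number of bytes copied.
    """
    copied = 0
    for offset, value in enumerate(old_block):
        if offset in failed_offsets:
            continue
        new_block[offset] = value
        copied += 1
    return copied
```

Keeping each byte at the same offset preserves the one-to-one correspondence between virtual addresses in the page and physical addresses in the block.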
Optionally, when the server restarts, the exception handling module 11 is further configured to perform fault detection on the memory, specifically to detect whether the faulty memory blocks recorded in the NVM have recovered. If it detects that the physical address block to which the original physical address belongs has recovered, the exception handling module 11 marks the state of that block as non-faulty, and the memory range identified by the original physical address block can again be used for memory allocation. If the block has not recovered, the exception handling module 11 isolates it permanently, and the memory range it identifies cannot be used for memory allocation.
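The restart check can be sketched as a function that splits the recorded faulty blocks into recovered and still-faulty sets. The name `retest_on_restart` and the `block_is_healthy` callback, which stands in for whatever memory test the platform runs, are assumptions.

```python
def retest_on_restart(faulty_blocks, block_is_healthy):
    """Re-test blocks recorded as faulty.

    block_is_healthy: callable taking a block number and returning True if the
    block passes the (platform-specific, assumed) memory test.
    Returns (recovered, still_faulty) as two sets of block numbers.
    """
    still_faulty = {b for b in faulty_blocks if not block_is_healthy(b)}
    recovered = set(faulty_blocks) - still_faulty
    return recovered, still_faulty
```

Blocks in `recovered` would be marked non-faulty and returned to the allocatable pool, while blocks in `still_faulty` remain isolated.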
With the device of this embodiment, while a process is running, the exception handling module monitors the state of the memory identified by the original physical address to which the virtual address of the process is mapped. If the memory identified by the original physical address fails, the exception handling module marks the state of the physical address block to which the original physical address belongs as faulty, thereby isolating the faulty memory space online. The exception handling module also calls the memory management module to reallocate a physical address block for the page to which the virtual address belongs, and synchronizes the data in the memory range identified by the original physical address block to the memory range identified by the reallocated physical address block. The virtual address of the process remains unchanged throughout, so the service is not interrupted while the faulty memory space is isolated online.
Fig. 3 is a flowchart of a memory fault isolation method according to Embodiment 2 of the present invention. The method of this embodiment is performed by a server. As shown in Fig. 3, the method may include the following steps:
Step 101: Monitor the state of the memory identified by the original physical address to which a virtual address of a process is mapped, where a mapping relationship exists between the page to which the virtual address belongs and the physical address block to which the original physical address belongs, and the physical address block identifies a contiguous memory range allocated to the process.
When a user starts an application program, the operating system of the server creates a process for the program, and the server allocates initial memory space to the process. The server allocates the initial memory as follows. First, it reads the information about the faulty physical address blocks saved in the NVM. Then, according to that information, it determines the non-faulty memory, which is the memory other than the faulty physical address blocks. Finally, it allocates the initial memory to the process from the non-faulty memory. In this embodiment, the memory management module implements the mapping between virtual addresses and physical addresses in the form of page tables. Page tables may be organized in one level or in multiple levels; for example, the Linux kernel uses a three-level page table scheme. The size of each page may be 4 KB, 2 MB, 1 GB, and so on. This embodiment does not limit how page tables are managed.
In this embodiment, by reading the information about the faulty physical address blocks in the NVM, the server avoids the memory identified by those blocks when it allocates initial memory to the process. While the process continues to run, the server monitors the state of the memory identified by the original physical address to which the virtual address of the process is mapped. That state is either faulty or normal. Specifically, the memory management module of the server receives a memory access request sent by the process; the request includes a virtual address of the process. The memory management module maps the virtual address to the original physical address and stores the correspondence between the two in the TLB. The memory management module then sends the original physical address to the memory controller over the memory bus, and the memory controller reads data according to the original physical address. If an exception occurs and the data in the memory identified by the original physical address cannot be read, the memory controller sends an abnormal access instruction over the memory bus, and the server determines from this instruction that the memory identified by the original physical address has failed.
Step 102: If the memory identified by the original physical address fails, mark the state of the physical address block to which the original physical address belongs as faulty.
In this embodiment, when the memory identified by the original physical address fails, the server marks the state of the physical address block to which the original physical address belongs as faulty and isolates that block. The information about faulty physical address blocks is usually recorded in the NVM, so it is not lost even if the server loses power. After the server is powered on again, it can still read the information about the faulty physical address blocks from the NVM and avoid the memory ranges they identify when allocating initial memory to a process.
Step 103: Reallocate, from non-faulty memory, a physical address block for the page to which the virtual address belongs, and synchronize the data in the memory range identified by the physical address block to which the original physical address belongs into the memory range identified by the reallocated physical address block.
In this embodiment, to ensure that the running service is not interrupted, the server reallocates a physical address block for the page to which the virtual address belongs. The physical address to which the virtual address is mapped changes as a result of the reallocation, but the virtual address of the process corresponding to the upper-layer application does not. As long as the virtual address is unchanged, the process is not interrupted, which ensures that the user's service is not interrupted.
Specifically, the server reallocates a physical address block for the page to which the virtual address belongs as follows. First, according to the information about faulty physical address blocks stored in the NVM, the server determines the non-faulty memory, which is the memory other than that identified by the faulty physical address blocks. Then, according to the virtual address and the process number of the process, the server reallocates a physical address block from the non-faulty memory for the page to which the virtual address belongs. This second step consists of two operations: first, the server obtains the page to which the virtual address belongs according to the virtual address and the process number of the process; then, it selects a physical address block from the non-faulty memory and establishes a mapping between that page and the selected physical address block.
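The reallocation steps above can be sketched as follows. This is a minimal illustration, assuming a page table keyed by (process number, virtual page) and a first-fit choice among free non-faulty blocks; the function names, the first-fit policy, and the exclusion of blocks already in use are assumptions, since the source does not say how a block is chosen.

```python
PAGE_SIZE = 4096  # assumed page size, equal to one physical address block here


def non_faulty_blocks(total_blocks, faulty_blocks, in_use):
    """Blocks that are neither recorded as faulty nor already mapped."""
    return [b for b in range(total_blocks)
            if b not in faulty_blocks and b not in in_use]


def remap_page(page_table, pid, virt_addr, total_blocks, faulty_blocks):
    """Reallocate a block for the page containing virt_addr.

    Returns (old_block, new_block). The virtual address itself is untouched;
    only the mapping stored in page_table changes.
    """
    # Obtain the page from the virtual address and the process number.
    page = (pid, virt_addr // PAGE_SIZE)
    in_use = set(page_table.values())
    candidates = non_faulty_blocks(total_blocks, faulty_blocks, in_use)
    if not candidates:
        raise MemoryError("no non-faulty physical address block available")
    old_block = page_table[page]
    # Establish the mapping between the page and the selected block.
    page_table[page] = candidates[0]
    return old_block, candidates[0]
```

For example, with six blocks, blocks 1 and 3 recorded as faulty, and blocks 0 to 2 in use, the page mapped to block 1 is moved to block 4.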
After reallocating a physical address block for the page to which the virtual address belongs, the server synchronizes the data in the memory range identified by the physical address block to which the original physical address belongs into the memory range identified by the reallocated physical address block. For the specific synchronization manner, refer to the related description in Embodiment 1; details are not repeated here.
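The synchronization details are deferred to Embodiment 1 and are not reproduced here, so the following Python sketch assumes the simplest possible scheme: a plain copy of the whole block, followed by a switch of the mapping. The function names and the tiny 16-byte block size are hypothetical and chosen only to keep the example short.

```python
BLOCK_SIZE = 16  # deliberately tiny block size for illustration


def sync_block(memory: bytearray, old_block: int, new_block: int) -> None:
    """Copy the contents of old_block into new_block (assumed plain copy)."""
    src = old_block * BLOCK_SIZE
    dst = new_block * BLOCK_SIZE
    memory[dst:dst + BLOCK_SIZE] = memory[src:src + BLOCK_SIZE]


def read_virtual(memory: bytearray, page_table: dict, pid: int, virt_addr: int) -> int:
    """Read one byte through the (process number, virtual page) mapping."""
    block = page_table[(pid, virt_addr // BLOCK_SIZE)]
    return memory[block * BLOCK_SIZE + virt_addr % BLOCK_SIZE]
```

After `sync_block` runs and the page table entry is switched to the new block, `read_virtual` returns the same byte for the same virtual address, which is the property that lets the process keep running.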
In the method of this embodiment, while the process is running, the server monitors the state of the memory identified by the original physical address to which the virtual address of the process is mapped. If that memory fails, the server marks the state of the physical address block to which the original physical address belongs as faulty, so that the faulty memory space is isolated online. The server also reallocates a physical address block for the page to which the virtual address belongs and synchronizes the data in the memory range identified by the original block into the memory range identified by the reallocated block. Because the virtual address of the process remains unchanged throughout, the service is not interrupted while the faulty memory space is isolated online.
On the basis of Embodiment 2, if the server restarts, it performs fault detection on the memory; specifically, it checks whether the faulty memory blocks recorded in the NVM have recovered. If the server detects that the physical address block to which the original physical address belongs has recovered, it marks the state of that block as non-faulty, and the memory range identified by the block can again be used for memory allocation. If the block has not recovered, the server permanently isolates it, and the memory range it identifies cannot be used for memory allocation.
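The restart check can be sketched as a partition of the recorded faulty blocks into recovered and still-faulty sets. This is an illustrative sketch only: the function name and the `block_is_healthy` callback are hypothetical, and the source does not describe how the fault detection itself is performed.

```python
def recheck_after_restart(faulty_blocks, block_is_healthy):
    """Re-test every block recorded as faulty.

    block_is_healthy is a caller-supplied test (assumed; the source does not
    specify the detection method). Returns (recovered, still_faulty):
    recovered blocks may be used for allocation again, the rest stay isolated.
    """
    recovered = {b for b in faulty_blocks if block_is_healthy(b)}
    still_faulty = set(faulty_blocks) - recovered
    return recovered, still_faulty
```

A caller would then mark the recovered blocks as non-faulty in the NVM record and keep the remaining blocks out of the allocator.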
A person of ordinary skill in the art will understand that all or some of the steps of the foregoing method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the foregoing embodiments are intended only to describe the technical solutions of the present invention, not to limit them. Although the present invention is described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

  1. A memory fault isolation device, characterized by comprising:
    an exception processing module, configured to monitor the state of the memory identified by the original physical address to which a virtual address of a process is mapped, wherein a mapping exists between the page to which the virtual address belongs and the physical address block to which the original physical address belongs, and a physical address block identifies a contiguous memory range allocated to the process;
    wherein, if the memory identified by the original physical address fails, the exception processing module is further configured to mark the state of the physical address block to which the original physical address belongs as faulty;
    a memory management module, configured to reallocate, from non-faulty memory, a physical address block for the page to which the virtual address belongs; and
    the exception processing module being further configured to synchronize the data in the memory range identified by the physical address block to which the original physical address belongs into the memory range identified by the reallocated physical address block.
  2. The device according to claim 1, characterized in that the exception processing module is further configured to save information about the physical address block marked as faulty to a non-volatile memory;
    the memory management module is specifically configured to:
    determine the non-faulty memory according to the information about faulty physical address blocks stored in the non-volatile memory; and
    reallocate, from the non-faulty memory, a physical address block for the page to which the virtual address belongs, according to the virtual address and the process number of the process.
  3. The device according to claim 2, characterized in that the memory management module is specifically configured to:
    obtain the page to which the virtual address belongs according to the virtual address and the process number of the process; and
    select a physical address block from the non-faulty memory, and establish a mapping between the page to which the virtual address belongs and the selected physical address block.
  4. The device according to claim 1, characterized in that the memory management module is further configured to:
    when allocating initial memory for the process, read the information about faulty physical address blocks stored in a non-volatile memory;
    determine non-faulty memory according to the information about the faulty physical address blocks, the non-faulty memory being the memory other than the faulty physical address blocks; and
    allocate the initial memory for the process from the non-faulty memory.
  5. The device according to any one of claims 1 to 4, characterized in that the exception processing module is further configured to:
    perform fault detection on the memory when the server restarts; and
    if it is detected that the physical address block to which the original physical address belongs has recovered, mark the state of that physical address block as non-faulty.
  6. A memory fault isolation method, characterized by comprising:
    monitoring the state of the memory identified by the original physical address to which a virtual address of a process is mapped, wherein a mapping exists between the page to which the virtual address belongs and the physical address block to which the original physical address belongs, and a physical address block identifies a contiguous memory range allocated to the process;
    if the memory identified by the original physical address fails, marking the state of the physical address block to which the original physical address belongs as faulty; and
    reallocating, from non-faulty memory, a physical address block for the page to which the virtual address belongs, and synchronizing the data in the memory range identified by the physical address block to which the original physical address belongs into the memory range identified by the reallocated physical address block.
  7. The method according to claim 6, characterized in that the method further comprises:
    saving information about the physical address block marked as faulty to a non-volatile memory;
    wherein the reallocating, from non-faulty memory, a physical address block for the page to which the virtual address belongs comprises:
    determining the non-faulty memory according to the information about faulty physical address blocks stored in the non-volatile memory; and
    reallocating, from the non-faulty memory, a physical address block for the page to which the virtual address belongs, according to the virtual address and the process number of the process.
  8. The method according to claim 7, characterized in that the reallocating, from the non-faulty memory, a physical address block for the page to which the virtual address belongs, according to the virtual address and the process number of the process, comprises:
    obtaining the page to which the virtual address belongs according to the virtual address and the process number of the process; and
    selecting a physical address block from the non-faulty memory, and establishing a mapping between the page to which the virtual address belongs and the selected physical address block.
  9. The method according to claim 6, characterized in that the method further comprises:
    when allocating initial memory for the process, reading the information about faulty physical address blocks stored in a non-volatile memory;
    determining non-faulty memory according to the information about the faulty physical address blocks, the non-faulty memory being the memory other than the faulty physical address blocks; and
    allocating the initial memory for the process from the non-faulty memory.
  10. The method according to any one of claims 6 to 9, characterized in that the method further comprises:
    performing fault detection on the memory when the server restarts; and
    if it is detected that the physical address block to which the original physical address belongs has recovered, marking the state of that physical address block as non-faulty.
CN201580011928.2A 2015-01-19 2015-01-19 Memory failure partition method and device Pending CN106133704A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/071008 WO2016115661A1 (en) 2015-01-19 2015-01-19 Memory fault isolation method and device

Publications (1)

Publication Number Publication Date
CN106133704A true CN106133704A (en) 2016-11-16

Family

ID=56416247

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201580011928.2A Pending CN106133704A (en) 2015-01-19 2015-01-19 Memory failure partition method and device

Country Status (2)

Country Link
CN (1) CN106133704A (en)
WO (1) WO2016115661A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108579093A (en) * 2018-05-10 2018-09-28 腾讯科技(上海)有限公司 The running protection method, apparatus and readable medium of target process
CN109522122A (en) * 2018-11-14 2019-03-26 郑州云海信息技术有限公司 A kind of EMS memory management process, system, device and computer readable storage medium
CN109753378A (en) * 2019-01-02 2019-05-14 浪潮商用机器有限公司 A kind of partition method of memory failure, device, system and readable storage medium storing program for executing
CN110858167A (en) * 2018-08-22 2020-03-03 阿里巴巴集团控股有限公司 Memory fault isolation method, device and equipment
CN114780473A (en) * 2022-05-18 2022-07-22 长鑫存储技术有限公司 Memory bank hot plug method and device and memory bank
CN115686901A (en) * 2022-10-25 2023-02-03 超聚变数字技术有限公司 Memory fault analysis method and computer equipment
WO2023193396A1 (en) * 2022-04-08 2023-10-12 苏州浪潮智能科技有限公司 Memory fault processing method and device, and computer readable storage medium

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110532124A (en) * 2019-09-06 2019-12-03 西安易朴通讯技术有限公司 Memory partition method and device
CN113495799B (en) * 2020-03-20 2024-04-12 华为技术有限公司 Memory fault processing method and related equipment
CN113515405A (en) * 2021-07-09 2021-10-19 维沃移动通信有限公司 Address management method and device
CN114020525B (en) * 2021-10-21 2024-04-19 苏州浪潮智能科技有限公司 Fault isolation method, device, equipment and storage medium
CN115617274A (en) * 2022-10-27 2023-01-17 亿铸科技(杭州)有限责任公司 Memory computing device with bad block management function and operation method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103064804A (en) * 2012-12-13 2013-04-24 华为技术有限公司 Method and device for access control of disk data
CN103631721A (en) * 2012-08-23 2014-03-12 华为技术有限公司 Method and system for isolating bad blocks in internal storage
CN103778065A (en) * 2012-10-25 2014-05-07 北京兆易创新科技股份有限公司 Flash memory and bad block managing method thereof

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090307563A1 (en) * 2008-06-05 2009-12-10 Ibm Corporation (Almaden Research Center) Replacing bad hard drive sectors using mram
CN102541676B (en) * 2011-12-22 2014-03-05 福建新大陆通信科技股份有限公司 Method for detecting and mapping states of NAND FLASH
CN103186471B (en) * 2011-12-30 2016-10-12 深圳市共进电子股份有限公司 The management method of bad block and system in storage device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103631721A (en) * 2012-08-23 2014-03-12 华为技术有限公司 Method and system for isolating bad blocks in internal storage
CN103778065A (en) * 2012-10-25 2014-05-07 北京兆易创新科技股份有限公司 Flash memory and bad block managing method thereof
CN103064804A (en) * 2012-12-13 2013-04-24 华为技术有限公司 Method and device for access control of disk data

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108579093A (en) * 2018-05-10 2018-09-28 腾讯科技(上海)有限公司 The running protection method, apparatus and readable medium of target process
CN108579093B (en) * 2018-05-10 2023-11-03 腾讯科技(上海)有限公司 Method, device and readable medium for protecting operation of target process
CN110858167A (en) * 2018-08-22 2020-03-03 阿里巴巴集团控股有限公司 Memory fault isolation method, device and equipment
CN110858167B (en) * 2018-08-22 2023-06-27 阿里巴巴集团控股有限公司 Memory fault isolation method, device and equipment
CN109522122A (en) * 2018-11-14 2019-03-26 郑州云海信息技术有限公司 A kind of EMS memory management process, system, device and computer readable storage medium
CN109522122B (en) * 2018-11-14 2021-12-17 郑州云海信息技术有限公司 Memory management method, system, device and computer readable storage medium
CN109753378A (en) * 2019-01-02 2019-05-14 浪潮商用机器有限公司 A kind of partition method of memory failure, device, system and readable storage medium storing program for executing
WO2023193396A1 (en) * 2022-04-08 2023-10-12 苏州浪潮智能科技有限公司 Memory fault processing method and device, and computer readable storage medium
CN114780473A (en) * 2022-05-18 2022-07-22 长鑫存储技术有限公司 Memory bank hot plug method and device and memory bank
CN115686901A (en) * 2022-10-25 2023-02-03 超聚变数字技术有限公司 Memory fault analysis method and computer equipment
CN115686901B (en) * 2022-10-25 2023-08-04 超聚变数字技术有限公司 Memory fault analysis method and computer equipment

Also Published As

Publication number Publication date
WO2016115661A1 (en) 2016-07-28

Similar Documents

Publication Publication Date Title
CN106133704A (en) Memory failure partition method and device
JP6529617B2 (en) Selective retention of application program data to be migrated from system memory to non-volatile data storage
US8601310B2 (en) Partial memory mirroring and error containment
US8099570B2 (en) Methods, systems, and computer program products for dynamic selective memory mirroring
US8151138B2 (en) Redundant memory architecture management methods and systems
US10049004B2 (en) Electronic system with memory data protection mechanism and method of operation thereof
US10983921B2 (en) Input/output direct memory access during live memory relocation
CN114579340A (en) Memory error processing method and device
CN110955495B (en) Management method, device and storage medium of virtualized memory
CN102483713A (en) Reset method and monitor
US20120117445A1 (en) Data protection method for damaged memory cells
CN102968353A (en) Fail address processing method and fail address processing device
EP3698251B1 (en) Error recovery in non-volatile storage partitions
US7293138B1 (en) Method and apparatus for raid on memory
US9904567B2 (en) Limited hardware assisted dirty page logging
Kourai Fast and correct performance recovery of operating systems using a virtual machine monitor
EP2921965B1 (en) Information processing device and shared memory management method
WO2017078707A1 (en) Method and apparatus for recovering in-memory data processing system
US9836359B2 (en) Storage and control method of the same
US9594792B2 (en) Multiple processor system
US10915404B2 (en) Persistent memory cleaning
JP2017157098A (en) Information processing device, information processing method, and program
JP6682897B2 (en) Communication setting method, communication setting program, information processing apparatus, and information processing system
TW201418984A (en) Method for protecting data integrity of disk and computer program product for implementing the method
EP4099171A1 (en) Systems, methods, and apparatus for page migration in memory systems

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161116