US6963934B2 - Hibernation of computer systems - Google Patents

Hibernation of computer systems

Info

Publication number
US6963934B2
US6963934B2
Authority
US
United States
Prior art keywords
data
hibernation
memory
disk
dma
Prior art date
Legal status
Expired - Lifetime
Application number
US11/010,778
Other versions
US20050108589A1 (en)
Inventor
Andrew V. Kadatch
James E. Walsh
Current Assignee
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date
Filing date
Publication date
Application filed by Microsoft Corp filed Critical Microsoft Corp
Priority to US11/010,778
Publication of US20050108589A1
Application granted granted Critical
Publication of US6963934B2
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC (assignment of assignors interest; assignor: MICROSOFT CORPORATION)
Adjusted expiration
Status: Expired - Lifetime (Current)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26: Power supply means, e.g. regulation thereof
    • G06F1/32: Means for saving power
    • G06F1/3203: Power management, i.e. event-based initiation of a power-saving mode
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1415: Saving, restoring, recovering or retrying at system level
    • G06F11/1441: Resetting or repowering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/4401: Bootstrapping
    • G06F9/4418: Suspend and resume; Hibernate and awake

Definitions

  • the present invention is generally directed to computer systems, and more particularly to preserving the state of computer systems when the computer is shut down.
  • Hibernation of computer systems generally refers to powering down a computer in a special way that preserves the state of the computer's configuration. Hibernation works by saving the volatile memory contents to a file on the computer's hard disk drive before shutting down. When power is restored, the file is read back into the memory and device states and services are restored, whereby the computer resumes the state it was in just before the hibernation, i.e., as if power was never turned off. Hibernation is especially popular for powering off mobile computers (e.g., notebooks and laptops) and other computers that are frequently turned on and off, because, unlike shutdown, hibernation automatically preserves opened documents, desktop settings, and other resources that have not been otherwise saved.
  • Hibernation is also preferable in many instances over a suspend mode that preserves the memory's contents by providing sufficient power to the RAM, because in suspend mode the RAM contents are lost if power is ever interrupted.
  • While hibernation is thus a valued feature, it is relatively slow because of the large amounts of data (e.g., in contemporary systems the memory is typically at least sixty-four megabytes) that need to be transferred to and from the disk.
  • Entering hibernation takes on the order of twenty to forty seconds, and resuming from hibernation takes ten to fifteen seconds, depending on the model of the hard disk and the amount of installed memory. Many users are unsatisfied with the amount of time the hibernation process takes, and indeed, virtually every user of this feature would prefer that it were faster.
  • the present invention provides an improved hibernation method and system, including the use of a modified DMA (Direct Memory Access) mode of transferring data to (and from) the disk.
  • the modified DMA mode increases data transfer speed, while also freeing the system processor to perform other tasks, including compressing the data to reduce the size of the data that needs to be transferred to (and later from) the disk.
  • the present invention also provides an improved decompression mechanism that speeds resume time.
  • the present invention also provides a special mode that is a combination of a hibernation mode and a suspend mode. This mode essentially maintains power to the RAM while transparently backing the RAM with the hibernation file, such that if power to the RAM is interrupted, the RAM contents are automatically restored from the hibernation file when power is restored.
  • modified DMA for transferring data to and from disk is significantly faster than programmed input/output (PIO), and also frees the processor to perform other tasks while the hard disk controller transfers the data to and from the disk.
  • the present invention uses a special, modified DMA engine (including an asynchronous polling I/O protocol) that works by, among other things, polling a hard disk controller status register instead of relying on interrupts to signal the completed status. Between polling cycles, the compression or decompression tasks are performed by the processor.
  • the hibernation engine is able to use system software services (e.g., drivers) required by DMA without changing the contents of the memory to be preserved.
  • the hibernation engine communicates with the hard disk controller driver and DMA driver to obtain the hardware and software requirements, e.g., the largest I/O request size that the hardware handles, the amount of memory needed for the driver's internal structures, the region of memory where an I/O buffer should be located, and the required I/O buffer alignment. In this manner, the drivers do not access memory outside of that reserved for hibernation, and data integrity is preserved.
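  • By way of illustration only, this requirements exchange might be modeled as a small capabilities structure that each extended driver fills in. The structure layout and function name below are hypothetical stand-ins, not the patent's actual interface:

    /* Illustrative only: names and layout are assumptions, not the actual
     * extended driver interface described in the text.                     */
    typedef struct HIBER_DRIVER_CAPS {
        unsigned int  supports_dma;      /* nonzero if DMA supported/enabled   */
        unsigned int  max_io_bytes;      /* largest single atomic I/O request  */
        unsigned int  scratch_bytes;     /* memory for internal structures     */
        unsigned int  buffer_alignment;  /* required I/O buffer alignment      */
        unsigned long max_buffer_addr;   /* highest usable I/O buffer address  */
    } HIBER_DRIVER_CAPS;

    /* Exported by each extended driver (hard disk driver 212, DMA driver 210).
     * The hibernation engine 200 would allocate scratch_bytes for each driver
     * in the safe region 202 before system services are shut down.            */
    int hiber_query_driver_caps(void *driver_context, HIBER_DRIVER_CAPS *caps);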
  • a two-phase LZ77 decoder includes a fast first phase that first guarantees sufficient room for decoding small symbol substrings and literal symbols, thus avoiding the need to check boundary limits before copying the literal symbol or substring as a whole. Because literal symbols and small substrings of symbols form the majority of compressed data, the reduced checking on average significantly speeds up decoding.
  • a second-phase, standard decoder is used that performs bounds checks on each symbol to ensure that the buffer does not overflow.
  • the standard decoder is normally used for decompressing only a small amount of data after a much larger amount of data has been decompressed with the fast decoder. In this manner, speed is greatly improved while not compromising safety that is needed in a reliable decoder.
  • FIG. 1 is a block diagram representing a computer system into which the present invention may be incorporated;
  • FIG. 2 is a block diagram generally representing exemplary components for preserving memory contents while entering hibernation in accordance with an aspect of the present invention;
  • FIGS. 3, 4A and 4B comprise a flow diagram generally representing steps taken to enter into hibernation in accordance with an aspect of the present invention;
  • FIG. 5 is a block diagram generally representing exemplary components for restoring memory contents while resuming from hibernation in accordance with an aspect of the present invention;
  • FIGS. 6 and 7 comprise a flow diagram representing general steps taken to resume from hibernation in accordance with an aspect of the present invention;
  • FIG. 8 is a flow diagram generally representing a general two-phase decoding procedure for restoring compressed data to an uncompressed state in memory in accordance with an aspect of the present invention;
  • FIGS. 9–11 comprise a flow diagram representing a fast decompression process in accordance with an aspect of the present invention.
  • FIG. 12 is a flow diagram generally representing a procedure for combining a suspend/sleep mode with a hibernation mode in accordance with an aspect of the present invention.
  • FIG. 1 illustrates an example of a suitable computing system environment 100 on which the invention may be implemented.
  • the computing system environment 100 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computing environment 100 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the exemplary operating environment 100 .
  • the invention is operational with numerous other general purpose or special purpose computing system environments or configurations.
  • Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
  • the invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
  • program modules include routines, programs, objects, components, data structures, and so forth, that perform particular tasks or implement particular abstract data types.
  • the invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote computer storage media including memory storage devices.
  • an exemplary system for implementing the invention includes a general purpose computing device in the form of a computer 110 .
  • Components of the computer 110 may include, but are not limited to, a processing unit 120 , a system memory 130 , and a system bus 121 that couples various system components including the system memory to the processing unit 120 .
  • the system bus 121 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures.
  • such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as Mezzanine bus.
  • the computer 110 typically includes a variety of computer-readable media.
  • Computer-readable media can be any available media that can be accessed by the computer 110 and includes both volatile and nonvolatile media, and removable and non-removable media.
  • Computer-readable media may comprise computer storage media and communication media.
  • Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can accessed by the computer 110 .
  • Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
  • modulated data signal means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.
  • the system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132 .
  • A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within the computer 110, such as during start-up, is typically stored in ROM 131.
  • RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120 .
  • FIG. 1 illustrates operating system 134 , application programs 135 , other program modules 136 and program data 137 .
  • the computer 110 may also include other removable/non-removable, volatile/nonvolatile computer storage media.
  • FIG. 1 illustrates a hard disk drive 141 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 151 that reads from or writes to a removable, nonvolatile magnetic disk 152 , and an optical disk drive 155 that reads from or writes to a removable, nonvolatile optical disk 156 such as a CD ROM or other optical media.
  • removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like.
  • the hard disk drive 141 is typically connected to the system bus 121 through a non-removable memory interface such as interface (e.g., hard disk controller) 140 , and magnetic disk drive 151 and optical disk drive 155 are typically connected to the system bus 121 by a removable memory interface, such as interface 150 .
  • hard disk drive 141 is illustrated as storing operating system 144 , application programs 145 , other program modules 146 and program data 147 . Note that these components can either be the same as or different from operating system 134 , application programs 135 , other program modules 136 , and program data 137 . Operating system 144 , application programs 145 , other program modules 146 , and program data 147 are given different numbers herein to illustrate that, at a minimum, they are different copies.
  • a user may enter commands and information into the computer 110 through input devices such as a keyboard 162 and pointing device 161, commonly referred to as a mouse, trackball or touch pad.
  • Other input devices may include a microphone, joystick, game pad, satellite dish, scanner, or the like.
  • These and other input devices are often connected to the processing unit 120 through a user input interface 160 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB).
  • a monitor 191 or other type of display device is also connected to the system bus 121 via an interface, such as a video interface 190 .
  • computers may also include other peripheral output devices such as speakers 197 and printer 196, which may be connected through an output peripheral interface 195.
  • the computer 110 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 180 .
  • the remote computer 180 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 110 , although only a memory storage device 181 has been illustrated in FIG. 1 .
  • the logical connections depicted in FIG. 1 include a local area network (LAN) 171 and a wide area network (WAN) 173 , but may also include other networks.
  • Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.
  • When used in a LAN networking environment, the computer 110 is connected to the LAN 171 through a network interface or adapter 170.
  • When used in a WAN networking environment, the computer 110 typically includes a modem 172 or other means for establishing communications over the WAN 173, such as the Internet.
  • The modem 172, which may be internal or external, may be connected to the system bus 121 via the user input interface 160 or other appropriate mechanism.
  • In a networked environment, program modules depicted relative to the computer 110, or portions thereof, may be stored in the remote memory storage device.
  • FIG. 1 illustrates remote application programs 185 as residing on memory device 181 . It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
  • the present invention is generally directed to improved hibernation in a number of ways, including by performing faster data transfer via direct memory access (DMA).
  • the present invention has been implemented using selected drivers in the Windows® 2000 operating system. As will be understood, however, the present invention is not limited to any particular operating system and/or drivers or even a driver model, but rather is capable of being implemented by virtually any mechanism in any type of system.
  • DMA refers to a fast way of reading or writing data from memory to a peripheral device.
  • DMA operates essentially by providing a hard disk controller 140 ( FIG. 2 ) with the various parameters it needs to access the appropriate memory and hard disk 141 locations, and then letting the hard disk controller 140 asynchronously perform the read or write operation.
  • Programmed input/output (PIO) is an alternative way to transfer data to and from a disk, but is much slower than DMA.
  • With PIO, the system processor 120 is responsible for executing the instructions that transfer the data to and from the disk 141.
  • Programmed I/O requires significant overhead, and moreover, requires work from the system processor 120 , which prevents the system processor 120 from doing other tasks while it performs the programmed I/O. Note that multithreaded operation is not allowed during hibernation, as the operating system components that control threading need to be effectively shut down to safely perform hibernation.
  • DMA and hard disk controller drivers require various amounts of memory to perform their operations, which in contemporary systems is virtual memory that is provided by mapping the virtual memory address to a physical memory location and swapping the existing physical memory contents to disk to preserve them for another process.
  • swapping is also disabled during hibernation, to “freeze” the memory contents.
  • existing DMA controllers/drivers work with operating system services that also need to allocate and modify memory, which is forbidden during hibernation, and thus those services are not available for use with hibernation.
  • the hard disk controller 140 also works with drivers that use operating system services and need to allocate and modify memory.
  • a fast hibernation engine 200 ( FIG. 2 ) provides a controlled manner of using DMA that does not require interrupts and also does not allow memory modification, other than in a special safe section (or sections) of memory 202 allocated for hibernation. This frees the system processor 120 to perform other tasks while the system hardware (DMA controller 204 and hard disk controller 140 ) perform the requested data transfer operation. In particular, to further reduce the transfer times, the memory contents are compressed upon write to the hibernation file 206 , and decompressed upon read, whereby only a reduced amount of data needs to be transferred.
  • writing less data via parallel compression in accordance with the present invention has been shown to significantly reduce the time to enter hibernation, from approximately twenty to forty seconds without the present invention to about seven to ten seconds with it (the ranges reflecting different models of hard disks and the amount of installed memory 132).
  • interrupts are disabled during hibernation.
  • DMA is possible without interrupts in most systems because most contemporary disk hardware (the controller 140 ) also writes the I/O request status to a status register 208 .
  • Although status registers are typically used only in PIO mode, the respective hardware standards allow the use of the status register in the DMA mode.
  • the present invention implements a polling interface/protocol, whereby the hibernation engine 200 (via the system processor 120 ) initiates a DMA I/O operation, and then via the drivers periodically checks (polls) the contents of the status register 208 to obtain the status, that is, whether the I/O request is completed.
  • the terms "intermittently," "regularly," "periodically," or the like with respect to the polling operation are equivalent, and are only intended to mean that the processor 120 is able to do other work in between asking for the status, rather than implying that any exact (or inexact) polling frequency is required.
  • the software drivers export an additional function that returns the state of the hardware to the hibernation engine 200 .
  • the hardware states include ready (i.e., ready for a next I/O operation, which indicates that any previous I/O succeeded), busy (currently executing the I/O request), or failed (unable to execute the I/O request).
  • the I/O status function should be called by the hibernation engine 200 frequently enough so that the disk is not unnecessarily idle, yet not so frequently that it interferes with the processor's ability to complete its parallel data compression or decompression tasks.
  • a compression engine 214 is called with a special callback function that is invoked sufficiently often to check the status of the pending asynchronous I/O operation and start a new I/O operation when the previous operation is complete.
  • the frequency of calling for disk status may be tuned for typical systems, and/or may be self-adjusting to adapt to various hardware differences, e.g., call less often while awaiting completion of a write if the disk is usually busy yet compression could use more processing time.
  • the drivers 210 , 212 are provided with a function for notifying them when a pending I/O operation is completed, so that a driver may take any appropriate actions required to finalize the I/O, as if an interrupt occurred.
  • the hard disk driver 212 and DMA driver 210 are controlled by the hibernation engine 200 to use only the safe hibernation locations of memory 202 .
  • the safe hibernation location 202 is specifically dedicated for use with the hibernation engine 200 , and comprises discardable pages that are not preserved during hibernation with the other system memory, and therefore may be modified. Note that the safe hibernation location 202 need not be physically contiguous (as generally represented in FIG. 2 ), but can be allocated in a plurality of locations, including specific physical regions, to accommodate the drivers' or hardware devices' needs.
  • Controlling the memory locations provided to the drivers 210 , 212 and complying with other requirements as described below allows existing, slightly-modified drivers to be used, instead of one or more specially-developed drivers (which although feasible, is significantly more work than is needed). Similarly, it is feasible to use two or more processors and/or special hardware, and/or processors that support multiple threads of execution, to further increase the speed of entering (and resuming from) hibernation.
  • In one implementation, the hard disk driver 212 (e.g., ATAPI.SYS) and the DMA controller driver 210 (e.g., PCIIDEX.SYS) provide these limitations and details; other DMA drivers and hard disk controllers (e.g., SCSI-based, ATA100-based, or others) can similarly provide limitations and details that will allow their memory usage to be controlled.
  • One such detail that is returned is whether the hard disk supports DMA and whether DMA is enabled; if not, the I/O will be performed in the old way, i.e., by the PIO mode with no attempt to do DMA.
  • Likewise, if the kernel fails to allocate I/O buffers satisfying the requirements of the hard disk driver 212 and DMA driver 210 as described below, hibernation will be performed via PIO.
  • Other information returned by the drivers 210 , 212 includes the amount of temporary memory required for storing any driver-specific data structures during operation. With this information, before beginning the data write operations, the hibernation engine 200 can request the operating system kernel to allocate an appropriate amount of memory for the respective drivers in the hibernation-safe area or areas 202 . This ensures that the drivers 210 , 212 will not need additional memory once the process of writing data for entering hibernation has begun, (because to freeze the non-hibernation memory contents for hibernation, allocation services and the like will not be allowed to operate).
  • the hibernation engine 200 first calls the drivers 210 , 212 to find out how much memory they need, then has an appropriate amount of memory allocated therefor at safe hibernation locations 202 , and gives each driver a respective pointer to its hibernation-safe memory location.
  • Once the drivers 210, 212 receive the respective pointers to the safe hibernation location or locations 202, they initialize their internal data structures at those locations. By allocating sufficient memory based on their specified needs, the device drivers 210, 212 will have their data buffers stored in memory that is already allocated to them, so that no memory mapping/moving will be required when later I/O requests are processed.
  • the initialization phase is not different from a regular scenario, except for the initial providing of the drivers' requirements and the memory being allocated in the hibernation-safe memory location or locations 202 .
  • the extended driver functions also return the maximum size of allowed I/O requests, because hard disk and DMA controllers have a limit on a size of an I/O request that they may execute as a single atomic operation. Staying under the limit ensures that the drivers 210 , 212 will not need kernel services (which will not be available) to handle larger I/O requests.
  • Another detail returned is any maximum I/O buffer address, to handle some older hard disks and DMA controllers that cannot write to or read from memory above 16 MB, e.g., some types of physical memory require special mapping before DMA-based I/O operations in order to make such memory visible for the DMA controller.
  • the hibernation engine 200 can match the requirements and limitations of the hard disk controller 140 and DMA controller 204 so that DMA hibernation is possible without requiring complex kernel services during the I/O.
  • Because the hibernation engine 200 splits any large I/O request into a sequence of smaller ones that otherwise match the hardware limitations, DMA requests will not cause any additional operations with the I/O buffer or I/O request that might violate the memory-related rules of hibernation.
  • the DMA-based I/O functions use the hibernation-safe memory location supplied via the pointer from the calling hibernation engine 200 . Since there is no need for the use of complex kernel services for dequeuing pending I/O requests, no dequeuing is performed.
  • the hibernation engine starts the I/O operation by calling the DMA driver 210 , which in turn instructs the DMA controller 204 to start writing to the hard disk controller 140 , and instructs the hard disk controller 140 to write a data sequence from the DMA controller 204 to a specific location on the hard disk 141 .
  • the driver works by getting a new I/O request (IO_START), whose parameters include the size of the requested I/O operation, the target location on the hard disk, and the address of the beginning of the I/O memory block, along with an address in the hibernation-safe memory region 202 for internal driver purposes.
  • the driver function verifies that each I/O request is compliant with the requirements imposed by the hardware it handles. If compliant, the function passes the request directly to lower-level routines without any transformations. If a request does not comply, the request is rejected and the driver returns STATUS_INVALID_IO_REQUEST.
  • When the I/O operation completes or fails, any operations on the DMA and hard disk controllers are stopped, and either STATUS_OK or STATUS_ERROR, respectively, is returned by the driver 210.
  • DMA-hibernation functions avoid using any system services or functions that send requests directly to respective hardware. If the I/O operation is not completed instantly and waiting is required, the driver 210 saves its state in its hibernation-safe memory location, and returns a STATUS_PENDING code to signal that I/O needs to be resumed later.
  • Instead of using interrupts to detect the end of the next stage of an I/O operation, the driver functions use a respective device status register 208.
  • If the status register 208 has the BUSY bit turned on (i.e., the device is still executing a request) and the ERROR bit turned off (i.e., there are no hardware errors), the driver 210 remembers its state in a hibernation-safe memory location supplied by the hibernation engine 200, and returns STATUS_PENDING. If the ERROR bit is turned on, the requested I/O operation failed, and the driver 210 returns STATUS_ERROR.
  • After STATUS_PENDING, the interface function should be called with IO_RESUME instead of IO_START the next time, until the previous I/O operation completes. Otherwise, this stage of I/O completed successfully, whereby the driver 210 returns a STATUS_OK code so that the caller may initiate a new I/O.
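  • The following C sketch models one stage of this polling protocol. The status-register bit masks, types, and helper functions are assumptions made for illustration, not the actual driver code:

    /* Sketch only: constants, types, and helpers are assumptions made for
     * illustration; the patent's actual driver interface is not shown here. */
    enum io_op     { IO_START, IO_RESUME };
    enum io_status { STATUS_OK, STATUS_PENDING, STATUS_ERROR,
                     STATUS_INVALID_IO_REQUEST };

    #define SR_BUSY  0x80u   /* assumed BUSY bit of the status register 208 */
    #define SR_ERROR 0x01u   /* assumed ERROR bit                           */

    struct io_state;                              /* lives in safe region 202 */
    int  request_is_compliant(const struct io_state *st);
    void save_driver_state(struct io_state *st);

    int hiber_disk_io(enum io_op op, struct io_state *st,
                      volatile unsigned char *status_reg)
    {
        if (op == IO_START && !request_is_compliant(st))
            return STATUS_INVALID_IO_REQUEST;     /* reject, no transformations */

        unsigned char sr = *status_reg;           /* poll instead of interrupts */
        if (sr & SR_ERROR)
            return STATUS_ERROR;                  /* requested I/O failed       */
        if (sr & SR_BUSY) {
            save_driver_state(st);                /* remember where we left off */
            return STATUS_PENDING;                /* caller retries, IO_RESUME  */
        }
        return STATUS_OK;                         /* stage done; next I/O OK    */
    }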
  • At step 300, the hibernation engine 200 tests whether the hibernation-related functions are present in the hard disk controller driver and DMA driver. If not, step 300 branches to step 314, where the known PIO mode of entering hibernation is attempted. If the hibernate functions are present, step 300 branches to step 302, wherein the hibernation engine 200 calls the drivers' respective functions to obtain the requirements and limitations of the drivers and their respective hardware devices as discussed above.
  • Step 304 then evaluates the returned information to test whether DMA is supported by the particular hardware. If not, step 304 branches to step 314 where the known PIO mode of entering hibernation is instead attempted. Alternatively, if DMA is supported, step 304 branches to step 306 wherein the hibernation engine 200 calls the kernel services (note that they are still active at this time) to allocate memory for the drivers in the safe hibernation location in accordance with their requirements. Step 308 represents the requested allocation of memory for the compression buffer 218 , including any alignment or special physical locations as specified by the drivers.
  • Step 310 represents a test of whether the allocation requests were successful. Note that in actuality, each separate allocation request has a returned status that may indicate an error (e.g., insufficient memory); however, for simplicity these are collectively shown by decision diamond 310. If there is an error, step 310 branches to step 314, where the known PIO mode of entering hibernation is instead performed.
  • Step 312 represents the passing of the pointers to the safe memory locations from the hibernation engine 200 to the drivers. Note that this can alternatively be provided later, such as in conjunction with the write request.
  • Step 314 represents the protecting of the memory regions that are to be preserved, by disabling interrupts and shutting down the system services. System services can be effectively shut down by gaining exclusive control of the processor or processors and thereafter not relinquishing control until after hibernation resume operations (described below) are complete. The process continues to step 400 of FIG. 4A to begin compressing and writing the compressed data to the hard disk 141 .
  • Step 400 represents the initializing of the buffer 218 in which compressed data will be placed prior to writing it, and step 402 tests whether the buffer is full, as discussed below, which initially is not the case.
  • The compression engine 214 is called at step 404, with parameters providing the compression engine 214 with a pointer to the output buffer 218 and the starting location (e.g., selected page) of the system memory 216 to be preserved. Note that compression of the data is not necessary for hibernation; however, compression significantly reduces data transfer times.
  • Not all non-hibernation-engine memory necessarily needs to be compressed/saved during hibernation, only those pages that are not marked as invalid or discardable.
  • memory pages may be deallocated, copied to disk for swapping purposes, and so forth, and thus there is no reason to preserve such unneeded or already-preserved pages.
  • the hibernation engine starts with a list of non-discardable physical (e.g., 4 KB) memory pages that should be saved, and (instead of advancing an input pointer into memory) advances through the list of "pages-to-write" until the end of the list is reached. Similarly, only those pages need to be restored. Thus, the system memory regions to preserve 216 may be considered only such selected pages.
  • any data compression engine is suitable, such as an LZ77-based compression engine that provides an acceptable compression ratio and compression speed.
  • LZ77 compression is a well-known data compression technique which generally operates by looking for matching bit or byte patterns in previous locations within a size-limited window, (e.g., 64 KB), and then replacing the matching pattern with a length, offset pairing that identifies the pattern. If no sufficient matching pattern is found, the pattern itself (one or more literals) is stored in the output compression buffer 218 . In general, the larger the window the better the compression ratio, but the longer it takes to search for matching symbol strings.
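  • To illustrate with a hypothetical example, the string "abcabcabc" can be encoded as the three literals 'a', 'b', 'c' followed by a single (length 6, offset 3) pair; decoding copies six symbols starting three positions back from the current output position, regenerating "abcabc" from the "abc" already decoded (overlapping copies are permitted and simply replicate the pattern).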
  • the compression engine 214 can be provided with data to compress and polled for completion; however, it is also straightforward to arrange a compression engine 214 with a callback mechanism to call a function of the hibernation engine 200 after some number of bytes are compressed, where the number may be specified by a parameter provided by the calling hibernation engine 200. In this manner, when compressing data, the pending I/O status can be regularly checked at a frequency controlled (primarily) by the hibernation engine 200.
  • Step 404 represents the call by the hibernation engine 200 to the compression engine 214, which compresses the data into the output compression buffer 218. With the call, the compression engine 214 is passed a pointer to call back the hibernation engine 200, so that when that amount of data has been compressed, the drivers can again be polled until the last pending I/O completes.
  • step 406 represents the checking of the disk status, which initially should be ready. If ready, step 410 is executed to notify the drivers 210 , 212 of any completed I/O, as needed. Step 412 checks whether there is enough compressed data to perform the write, e.g., based on hardware requirements and/or otherwise for efficiency, such as to write larger blocks per request rather than several smaller ones. If there is not enough data, the process loops back to step 404 to call back the compression engine 214 to compress more data. Note that with appropriate buffer sizes, it is not possible to have a situation wherein the buffer 218 is full yet there is not enough data to write.
  • When there is enough data to perform a write, as determined via step 412, step 414 tests whether the data ready in the output buffer 218 exceeds the amount that the hardware can handle, which as discussed above was provided to the hibernation engine 200 via the calls to the drivers. If so, step 414 branches to step 416 to set the write request to the hardware limit, thereby essentially segmenting this request so that the hardware drivers do not need the complex kernel services (now unavailable) to handle the request. Step 418 represents the DMA write request to the DMA controller driver 210.
  • a circular output buffer that is properly aligned and whose size is a multiple of the minimum I/O request size may be used (e.g., the maximum I/O request size is typically 4 KB, with a non-crossable boundary, typically equal to 64 KB).
  • Disk writes are in sizes that are a multiple of the minimum I/O request size (typically 512 bytes), but do not exceed the maximum I/O request size, and the I/O request address range does not cross a boundary.
  • the circular buffer comprises three possibly empty parts that are circularly contiguous, namely compressed data ready to write, unused space, and compressed data being written.
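  • One plausible layout for such a buffer is sketched below; the field names and bookkeeping are assumptions for illustration, since the patent does not prescribe a particular representation:

    /* Illustrative circular compression buffer. Going around the circle:
     *   [dma_start,   ready_start) compressed data currently being written,
     *   [ready_start, free_start)  compressed data ready to write,
     *   [free_start,  dma_start)   unused space the compressor may fill.
     * When a DMA write completes, dma_start advances to ready_start; when a
     * new write of n bytes starts, ready_start advances by n (modulo size). */
    #include <stddef.h>

    struct circ_buf {
        unsigned char *base;        /* aligned as the drivers require       */
        size_t         size;        /* multiple of the minimum I/O size     */
        size_t         dma_start;   /* start of the in-flight DMA region    */
        size_t         ready_start; /* start of data ready to be written    */
        size_t         free_start;  /* where the compressor appends output  */
    };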
  • At step 420, while waiting for the write to complete, if there is more data to compress, the process loops back to step 402 to enable parallel data compression, provided that there is room in the output buffer 218 for the compressed data at step 402.
  • Because the hibernation engine 200 controls the output buffer location into which the compression engine 214 will compress the source data, as well as the amount of data to compress, it ensures that any already-compressed data will not be overwritten in the compression output buffer 218 until it has been successfully written to disk.
  • the hibernation engine 200 may thus wait as necessary for the next write to complete to free up enough space in the buffer 218 .
  • the hibernation engine 200 can assume a worst-case data compression scenario (e.g., that actually expands rather than compresses the data) when evaluating whether there is room in the output buffer 218 .
  • Step 406 represents the subsequent polling calls by the hibernation engine 200 to have the driver check the status register 208, as described above. While the write request is pending, step 406 loops back to step 402 to enable further data compression to occur in parallel with the write, until the status changes to ready or failed. If the DMA I/O write operation fails (the driver returned STATUS_ERROR), at step 408 the hibernation engine 200 may retry the request some number of times (e.g., once more), and if it still fails, will try to switch to PIO mode and continue operation in PIO mode.
  • When the status indicates ready, the hibernation engine 200 notifies the drivers 210, 212 at step 410, and proceeds via steps 414-420 to repeat the process and start the next write, as summarized in the sketch below.
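  • The overall loop of FIG. 4A can be summarized in the following sketch, in which every helper function is a hypothetical stand-in for the engine and driver calls described above (the final partial write and padding of FIG. 4B are omitted):

    #include <stddef.h>

    /* Sketch of the FIG. 4A compress-and-write loop; poll_disk_status() is
     * assumed to return STATUS_OK when no I/O is pending.                   */
    enum io_status { STATUS_OK, STATUS_PENDING, STATUS_ERROR };

    int    output_buffer_has_room(void);   /* step 402                        */
    int    more_data_to_compress(void);    /* step 420                        */
    void   compress_some_data(void);       /* step 404: append to buffer 218  */
    int    poll_disk_status(void);         /* step 406: read status register  */
    void   retry_or_switch_to_pio(void);   /* step 408                        */
    void   notify_drivers_io_done(void);   /* step 410                        */
    size_t ready_bytes(void);              /* compressed bytes awaiting write */
    size_t min_write_bytes(void);          /* hardware minimum (step 412)     */
    size_t hw_max_io_bytes(void);          /* hardware maximum (steps 414-416)*/
    void   start_dma_write(size_t n);      /* step 418                        */

    void hiber_write_loop(void)
    {
        for (;;) {
            if (output_buffer_has_room() && more_data_to_compress())
                compress_some_data();            /* compress in parallel       */

            int st = poll_disk_status();
            if (st == STATUS_PENDING)
                continue;                        /* disk busy; keep compressing */
            if (st == STATUS_ERROR) {
                retry_or_switch_to_pio();
                continue;
            }
            notify_drivers_io_done();            /* previous write finished    */

            size_t n = ready_bytes();
            if (n < min_write_bytes()) {
                if (!more_data_to_compress())
                    break;                       /* tail written as in FIG. 4B */
                continue;                        /* need more compressed data  */
            }
            if (n > hw_max_io_bytes())
                n = hw_max_io_bytes();           /* segment oversized requests */
            start_dma_write(n);
        }
    }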
  • When there is no more data to compress, step 430 of FIG. 4B represents polling the status for the previous write request to complete, again handling any failure via step 432. Upon success, step 434 calls the drivers 210, 212 to notify them of the completion.
  • Step 436 then tests whether there is more data in the compression output buffer 218 to write. If not, the memory contents have been saved to the hibernation file 206 and the process continues to step 448 , described below. If there is more data in the output buffer 218 to write, step 438 tests whether there is enough to write, and if not, branches to step 440 to fill an otherwise inadequate write request with blank data or the like if needed to meet a hardware minimum.
  • Otherwise, step 438 branches to step 442 to test the amount of data that can be written against the hardware limit. If too large, step 442 branches to step 444 to set the write request to the hardware limit, thereby essentially segmenting this request.
  • Step 446 represents the DMA write request to the DMA controller driver 210 . Following the write request, the process returns to step 430 to await the write status as described above.
  • Step 448 represents the marking of the hibernation file 206 as valid (tested upon resume, as described below), the closing of the hibernation file 206, and the outputting of the instructions (e.g., to various registers) that put the machine hardware into the proper mode so as to awaken when one or more various criteria are satisfied.
  • hibernation can be rapidly achieved via parallel data compression and DMA disk writing.
  • Resuming from hibernation can proceed as before, e.g., via PIO.
  • the present invention also provides two significant improvements to the resume from hibernation operation, which can be combined or used separately to resume much faster than with conventional PIO and decompression models.
  • A first improvement, described below, deals with the use of DMA to read the hibernation file 206 and thereby allow parallel decompression to occur.
  • A second improvement, also described below, provides a significantly faster decompression process than those previously used.
  • FIG. 5 and the flow diagrams of FIGS. 6 and 7 represent the resume components and process when performed with DMA.
  • a system loader process 500 has been modified to check the hibernation file 206 , and if marked valid, read in (e.g., via PIO) the components needed to perform DMA, e.g., the hibernation engine 200 (or a resume equivalent) along with the DMA controller driver 210 and hard disk driver 212 .
  • PIO reading (with no DMA) is conventionally (and exclusively) performed during resume from hibernation, and thus will not be described in detail herein.
  • the initial system loader 500 first checks the status of the hibernation file 206 as represented in FIG. 6 via step 600 . If not valid, via step 602 the system will boot normally, e.g., without attempting to restore any hibernated state, since this normally means that the user did not enter into hibernation when the machine was powered down. Another possibility is that the hibernation file 206 is marked as being in a “Resuming” state. As described below, this means that the hibernation resume was attempted, but failed for some reason.
  • If step 600 determines that the hibernation file 206 is valid, step 604 marks the file as "Resuming" (e.g., via a PIO write), whereby if the resume process fails during memory restoration, the user will be prompted to retry the resume or direct the machine to boot normally. This gives the user control, rather than the machine forever attempting to resume from a failed hibernation. Note that later, if the resume operation was successful, the hibernation file 206 will be marked as invalid.
  • Step 606 represents the reading in of the hibernation engine 200 , a decompression engine 502 , and the drivers that enable DMA.
  • the system loader can read this into a hibernation resume safe memory region 504 , or once loaded the hibernation engine 200 can move itself and the other components around as needed.
  • the system loader 500 calls the hibernation engine 200 as represented by step 608 .
  • the system loader 500 (or the hibernation engine 200 ) can load the decompression engine 502 based on the type of compression that was used, if known in advance or is otherwise determinable, (e.g., via metadata associated with the hibernation file 206 ). For example, as compression technology has evolved, different decompression engines are needed, and the present invention can selectively load a matching one so that new technologies can be used, yet backward compatibility is retained.
  • the hibernation engine 200 initializes one or more decompression input buffers 506 to prepare for reading and decompressing of the hibernation file 206 .
  • the hibernation engine 200 then proceeds to step 700 of FIG. 7 to call the DMA controller driver 210 to read in the data for decompression and restoration thereof via DMA read request, passing a pointer to the input buffer 506 .
  • the hibernation engine 200 uses a logically circular input buffer 506 or the like that is sufficiently large with respect to the original compression window so that a read can occur into one location while another location is being decompressed without the possibility that data (a pattern) that is needed for decompression will be overwritten by the next read.
  • the buffer can be safely filled in one buffer location while the data in another buffer location is being decompressed.
  • the read size may be adjusted for segmenting if a logically circular buffer is being used so as to not read beyond the physical end of the buffer.
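  • For example, one simple arrangement that satisfies this is double buffering: split the input buffer into two halves, each at least the maximum read size, and let DMA fill one half while the decompression engine consumes the other.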
  • Step 702 checks whether the buffer 506 is ready with a sufficient amount of data to start decompression, because, for example, there may not be any data in it, or because the reads may have to be segmented to match hardware limitations, and it may be more efficient to decompress data in large amounts per call rather than regularly call the decompression engine with only small amounts. If more data is needed, step 702 bypasses calling the decompression engine, otherwise step 704 is executed to call the decompression engine 502 to decompress some amount of data.
  • the decompression engine 502 may be similar to the compression engine as described above with respect to its called functions, e.g., passed a pointer to the appropriate location in the input buffer 506 , an amount (e.g., number of compressed bytes) to decompress, and a callback pointer for calling the hibernation engine 200 when finished.
  • the decompression engine 502 is given a pointer to an output buffer, which may be a temporary restore buffer or directly at a location (e.g., in a selected page) in the system memory 216 that is being restored.
  • Step 706 represents the polling for the DMA status, (similar to that described above with respect to the write request), with step 708 representing the handling of the failure case. If a failure occurred, one or more DMA or PIO retries may be made, which if also unsuccessful will cause a restart of the system, this time with the hibernation file 206 marked as “Resuming” status. If the status is pending (busy), the process branches back to step 702 to attempt to decompress more data. If ready, the process continues to step 710 to notify the drivers.
  • Step 712 tests for whether the hibernation file 206 has been fully read into the input buffer 506 . If not, step 712 branches to step 714 to determine whether there is adequate room in the input buffer 506 for further data to be read. If there is not enough room, step 714 branches to step 704 to decompress some more data. If there is room, step 714 returns to step 700 to read in another amount. Note that in addition to detecting the end of the file 206 , the system loader/hibernation engine 200 can also detect out of memory situations, such as when memory was physically removed since hibernation was entered or the hibernation file 206 “appears” on a different machine (because the hard drive was transferred) without sufficient memory. For example, in situations when it turns out that there is more data to decompress than memory for outputting it, the hibernation file 206 can be marked invalid and the system restarted.
  • When the hibernation file 206 has been fully read, steps 716, 718 and 720 are executed to complete the resume-from-hibernation process, including decompressing the remaining data in the input buffer 506, marking the hibernation file 206 as invalid (so that it will not be used again unless hibernation is explicitly re-entered), and exiting hibernation.
  • Exiting hibernation essentially comprises deallocating as much of the hibernation resume safe memory 504 as possible and relinquishing control at the hibernation engine 200 to the processes that were executing when hibernation was entered.
  • When the system memory 216 (e.g., each of the selected pages) has been restored, the state of devices, services and so forth are restored, and the computer operates as it was before hibernation was entered.
  • a significantly faster LZ77 decompression decoder is provided that can be used with data decompression generally, including when resuming from hibernation (whether PIO resume or DMA-based resume).
  • the faster decoder operates by reducing the number of bounds checks on the buffers without allowing the output buffer to overflow.
  • a reliable decompression decoder needs to handle not only valid and correctly encoded data, but also corrupted data. If encoded data is corrupted, a reliable decoder either detects and reports an error, or, if an error cannot be detected easily, continues execution without crashing or causing system instability. Note that checksums and the like detect data corruption only probabilistically, and thus are not guaranteed to be perfect, as there is always a chance that a checksum will miss data corruption.
  • an LZ77 decoder has to ensure that the input buffer has enough data, that each substring referenced by a pointer in LZ77 code actually exists, and that the output buffer has enough space to hold decoded symbols. If any of these rules are not enforced, corrupted data may cause the decoder to access memory outside of its input/output buffers, causing a memory protection fault or overwriting other key data, thus crashing or destabilizing the system. Much of the processor time used by known decoders is spent checking the bounds to ensure that decoding operations stay within the buffers.
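  • A fully checked copy of one match might look like the following sketch (the names and token handling are assumed for illustration); note the per-symbol bounds cost that the fast decoder described below aims to avoid:

    #include <stddef.h>

    /* Sketch of the fully bounds-checked ("standard") copy of one LZ77 match. */
    int copy_match_checked(unsigned char *out, size_t *out_pos, size_t out_cap,
                           size_t offset, size_t length)
    {
        if (offset == 0 || offset > *out_pos)       /* must reference decoded data */
            return -1;                              /* corrupted input             */
        if (length > out_cap - *out_pos)            /* would overflow the buffer   */
            return -1;
        while (length--) {                          /* symbol-by-symbol copy, with */
            out[*out_pos] = out[*out_pos - offset]; /* a check per copied symbol   */
            (*out_pos)++;
        }
        return 0;
    }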
  • FIGS. 8–11 are flow diagrams that explain how the fast decoder (e.g., 508 ) of the present invention substantially increases the decompression speed by reducing the number of bounds checks for the majority of the data to be decompressed, yet does so safely.
  • the remaining data that cannot be safely decompressed with the fast decoder 508 is decompressed with a standard decoder 510 .
  • both the standard decoder 510 and fast decoder 508 are relatively small, and thus it is not a significant burden on a contemporary system to have two types of decoders loaded into memory at the same time.
  • the fast and standard decoders can be considered a two-phase decoder, and can be incorporated into the decompression engine 502 of FIG. 5 .
  • the first phase (fast) decoder 508 benefits from common statistical properties of typical encoded data (texts, documents, executable programs, object files, libraries, HTML/XML pages, database files, and so forth) in order to considerably reduce amount of bounds checks needed on average.
  • One feature that the present invention leverages is that in typical data, most matching substrings are relatively short. Indeed, in typical encoded data, approximately ninety-five percent of matching substrings do not exceed eight symbols in length, while the average length of a matching substring is approximately five symbols.
  • the fast decoder 508 of the present invention more efficiently handles the decoding of these short substrings and literal symbols by eliminating the output buffer bounds checks for them and replacing the conventional symbol-by-symbol copying (that checks for the end of the substring after each copied symbol) with a technique that unconditionally copies a fixed amount of symbols.
  • the fast decoder 508 guarantees that the output buffer will have enough space left to hold short substrings and literal symbols. When the buffer is almost full and this guarantee is not possible, the decompression process switches to the second decoding phase that finishes the decoding using a standard decoder 510 .
  • the fast decoder 508 guarantees that it has enough space in the output buffer for some number N of symbols, chosen herein to be space for eight (8) more symbols. The choice of the actual value may vary depending on a particular implementation and hardware.
  • FIG. 8 shows a general process that sets up and then calls the fast decoder 508 in a first phase, and the standard decoder 510 in a second phase, to decode some amount of data.
  • Step 800 represents the initialization of the input and output buffers.
  • Step 802 calculates the initial maximum safe input position, and step 804 then calls the fast decoder, beginning at step 900 of FIG. 9.
  • the guarantee is accomplished by periodically adjusting an input buffer bound limit in view of N, the number of symbols that are decoded by the second phase.
  • FIGS. 9–11 show the process of the fast decoder 508, beginning at step 900, wherein a safety check is made that the actual input position is not greater than or equal to the maximum safe value (first calculated at step 802). At this time, the input position is zero, so step 900 branches to step 1000 of FIG. 10.
  • Step 1000 of FIG. 10 selects as a token the next element (a literal symbol or length, offset pairing) from the input buffer (e.g., 506 ), and then at step 1002 moves the input position based on the size of the token.
  • Step 1006 checks whether the token is a literal symbol. If so, the token is copied to the output buffer at step 1008 , without any bounds check. Note that during the first phase of decoding, it is guaranteed that the output buffer always has enough space to store at least N more symbols. This is done without increasing the number of comparisons, because before decoding, the next token decoding operation ensures that the input buffer limit is not crossed.
  • Step 1010 adjusts the position in output buffer to prepare for the next symbol.
  • If not, step 1006 branches to step 1012 to obtain the length and offset values from the token.
  • Step 1014 is a check that ensures that the offset is not greater than the output position. Note that in LZ77 compression, any given offset should point to a previous location in the already decoded data in the output buffer. If not, the compressed data is corrupted, and step 1014 branches to step 1016 to report this and return to FIG. 8 to halt the decompression.
  • the first phase decompression engine processes substrings of N or less symbols and longer substrings differently. If the length of a substring is N or less, then the decompression engine copies N (e.g., eight) symbols into the output buffer (which is guaranteed to have enough space to hold N more symbols), and then adjusts the output position (pointer) by the actual length (e.g., five) of the substring. Note that since N is small, copying of the few extra symbols (which will then be overwritten) is normally faster than copying the substring symbol by symbol in a loop while comparing the number of symbols copied so far against the actual amount needed.
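  • A minimal sketch of this short-substring fast path, assuming N = 8 and the hypothetical names below; the phase-one bound guarantees at least N writable bytes past the output position, so no output bounds check appears:

    #include <stddef.h>

    #define N 8   /* guaranteed output headroom in phase one (patent uses 8) */

    /* Fast path for matches of length <= N. The caller has already verified
     * offset <= *out_pos (step 1014) and guarantees N bytes of headroom.    */
    static void copy_short_match(unsigned char *out, size_t *out_pos,
                                 size_t offset, size_t length /* <= N */)
    {
        unsigned char *dst = out + *out_pos;
        const unsigned char *src = dst - offset;
        for (int i = 0; i < N; i++)   /* fixed-count copy: no per-symbol      */
            dst[i] = src[i];          /* length test; forward order preserves */
                                      /* LZ77 semantics when offset < length  */
        *out_pos += length;           /* extra bytes are overwritten later    */
    }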
  • Step 1024 is then executed to determine whether there is another token to decode. If not, the fast decoding is done, and step 1026 sets the decode status to “OK” or the like and returns to FIG. 8. Otherwise, the process returns to FIG. 9, step 900.
  • If at step 904 this number is zero, then the output position is too close to the end of the output buffer to use the first phase decompression engine, so the decode status is set to OK and the process returns to FIG. 8 to decode any remaining data via the standard decoder 510.
  • Otherwise, step 908 sets a new maximum safe input position equal to the actual input position plus the number of safe tokens. If at step 910 the maximum safe input position does not exceed the maximum value allowed for it, the process branches to FIG. 10 to decode another literal or substring as described above. If, however, the maximum safe input position exceeds the maximum value, step 912 sets it to the maximum value, and step 914 checks whether the actual input position exceeds this maximum value. If so, the data is corrupt (step 916); if not, the process branches to FIG. 10 to decode another literal or substring as described above.
  • FIG. 11 represents the steps generally taken when a substring is greater than eight symbols in length, beginning at step 1100 wherein a last output position variable is set to the output position plus the length.
  • Step 1102 checks whether the last output position is greater than the maximum output position allowed. If so, the data is corrupt, and step 1102 branches to step 1110 to set the status to corrupt and return to FIG. 8 . Normally this is not the case, and step 1102 branches to step 1104 .
  • At step 1104, the decompression engine compares the length with a MAX_LENGTH escape code value, which, in general, is a special code indicating that the length is greater than a more simply represented length value, and that the actual length has to be decoded from other bits. If so, step 1106 is executed, in which the decompression engine efficiently consumes a sufficiently large substring by copying MAX_LENGTH symbols at a time to the output buffer, until the number of uncopied symbols remaining is less than MAX_LENGTH.
  • With each such copy, the output position counter is incremented by the MAX_LENGTH value, and a counter of symbols left to copy is decremented by MAX_LENGTH. Any remaining (MAX_LENGTH - 1) or fewer symbols are copied one by one via steps 1112, 1114, 1116 and 1118, wherein the last output position value represents where the output position should be when the last symbol is to be decoded.
  • In this manner, the decompression engine guarantees that copying a substring of Length symbols will require no more than (Length/MAX_LENGTH + MAX_LENGTH - 1) comparisons, effectively reducing the number of comparisons by a factor of almost MAX_LENGTH for a sufficiently large Length value.
  • Steps 1108 and 1110 handle the situation of not enough data in the input buffer for a given length, which indicates corrupted data.
  • When the substring has been copied, the process returns to FIG. 9 as described above.
  • The use of the fast two-phase LZ77 decoder implementing the present invention improved actual decoding speeds relative to a conventional decoder by about twenty to forty percent (depending on the actual compressed data and hardware used), primarily by reducing the number of comparisons (by about a factor of two).
  • FIG. 12 shows a way in which combining hibernation with a suspend or sleep mode can be accomplished, for example, beginning at step 1200 when a user “shuts down” a machine, with an option or the like that puts the machine into a suspend or sleep mode (sometimes referred to as an “S4” state).
  • This option may be user-selectable (e.g., “suspend with hibernation” versus normal “suspend”) or automatic, e.g., suspend is backed by hibernation without the user being explicitly aware of it.
  • A suspend/sleep state may be generally accomplished by providing sufficient power to the volatile memory to maintain its contents and shutting down the rest of the hardware devices.
  • Hibernation that backs the suspend state may be further accomplished by removing power from the devices not needed for hibernation (step 1202) while temporarily maintaining power to the hard disk (controller and drive), and preferably the DMA controller. Note that this gives the appearance of instantaneously entering suspend, e.g., the display goes dark, speakers go silent and so forth (although a perceptive user may notice the hard drive operating for a while longer).
  • The remaining powered-up hardware is then used (e.g., via the DMA process as described above, or via a PIO process) to automatically write the memory contents to a valid hibernation file at step 1204.
  • Once the hibernation file is written and marked valid, the hard disk hardware and DMA controller are powered off, as represented in FIG. 12 via step 1206.
  • The system is now in a suspended mode as previously understood, but with a valid hibernation file backing the memory contents as if hibernating.
  • Step 1208 represents the user activating the machine. Because the system is suspended/asleep, if there was no power outage and the memory is still intact at step 1210 , the system may instantly resume from its sleep/suspend state, and marks (e.g., via the hardware/firmware that enables resuming from suspend) the hibernation image as “invalid” since it is not needed (step 1212 ). However, if a power outage occurred and the user turns the machine back on, the hardware/firmware will load the system loader process (step 1214 ), which in turn will see a valid hibernation file (step 1216 ) and resume from hibernation as described above. At step 1218 , the hibernation file is marked as invalid by the hibernation engine when the resume is successful, as also described above.
  • As can be seen from the foregoing, there is provided improved hibernation for computer systems. Entering and resuming from hibernation is made significantly faster by DMA and/or a faster decoder. Hibernation may be combined with suspend/sleep modes to provide near-instantaneous state preservation, yet with the resistance to power outages provided by hibernation.
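By way of illustration only, the following C sketch shows the shape of the two-phase fast path referred to in the list above. It is a minimal reading of the technique, not the patented implementation: the token format, the helper names (next_token, copy8, fast_phase), and the reuse of the N-symbol chunk for long copies are assumptions, and the input-side bound adjustments of steps 900–916 are omitted.

```c
#include <stddef.h>
#include <stdint.h>

#define N_GUARD 8  /* "N": output headroom guaranteed during the fast phase */

typedef struct {             /* assumed decoded-token shape */
    int      is_literal;
    uint8_t  literal;
    uint32_t length, offset; /* used when is_literal == 0 */
} token;

/* Assumed token parser; returns 0 when the input is exhausted. */
extern int next_token(const uint8_t **in, const uint8_t *in_end, token *t);

/* Copy exactly 8 symbols, forward, with no per-symbol end-of-substring
 * test; forward order keeps overlapping matches (offset < 8) correct. */
static void copy8(uint8_t *dst, const uint8_t *src)
{
    dst[0] = src[0]; dst[1] = src[1]; dst[2] = src[2]; dst[3] = src[3];
    dst[4] = src[4]; dst[5] = src[5]; dst[6] = src[6]; dst[7] = src[7];
}

typedef enum { DEC_DONE, DEC_CORRUPT, DEC_NEED_SLOW_PHASE } dec_result;

dec_result fast_phase(const uint8_t **in, const uint8_t *in_end,
                      uint8_t **outp, uint8_t *out_start, uint8_t *out_end)
{
    uint8_t *out   = *outp;
    uint8_t *limit = out_end - N_GUARD; /* keep room for N_GUARD symbols */
    token    t;

    while (out <= limit) {
        if (!next_token(in, in_end, &t)) { *outp = out; return DEC_DONE; }

        if (t.is_literal) {
            *out++ = t.literal;                /* no output bounds check */
        } else if (t.offset > (uint32_t)(out - out_start)) {
            return DEC_CORRUPT;                /* must point into history */
        } else if (t.length <= N_GUARD) {
            copy8(out, out - t.offset);        /* copy 8 unconditionally, */
            out += t.length;                   /* advance by real length  */
        } else {
            if (t.length > (uint32_t)(out_end - out))
                return DEC_CORRUPT;
            const uint8_t *src = out - t.offset;
            uint32_t left = t.length;
            while (left >= N_GUARD) {          /* chunked long-match copy */
                copy8(out, src);
                out += N_GUARD; src += N_GUARD; left -= N_GUARD;
            }
            while (left--) *out++ = *src++;    /* tail, symbol by symbol  */
        }
    }
    *outp = out;
    return DEC_NEED_SLOW_PHASE; /* near the end: finish via standard decoder */
}
```

The key point is visible in the short-match branch: exactly eight symbols are copied with no per-symbol end-of-substring test, and the output pointer is then advanced by the true length, so any extra symbols are simply overwritten by the next token's output.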

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Quality & Reliability (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

An improved hibernation method and system, including the use of a modified DMA (Direct Memory Access) mode of transferring data to and from the disk. The use of DMA increases data transfer speed, while freeing the system processor to perform other tasks, including compressing/decompressing the data transferred to and from the disk. An improved decoder is also provided that reduces the number of bounds checks needed on average for typical compressed data by first guaranteeing that there is sufficient room to decode literals and small substrings, whereby bounds checking is not needed. A combined hibernation and suspend mode is also provided that essentially maintains power to the RAM while transparently backing the RAM with the hibernation file, such that if power to the RAM is interrupted, the RAM contents are automatically restored from the hibernation file when power is restored.

Description

CROSS-REFERENCE TO RELATED APPLICATION
This is a continuation of U.S. patent application Ser. No. 09/814,474, filed Mar. 21, 2001.
FIELD OF THE INVENTION
The present invention is generally directed to computer systems, and more particularly to preserving the state of computer systems when the computer is shut down.
BACKGROUND OF THE INVENTION
Hibernation of computer systems generally refers to powering down a computer in a special way that preserves the state of the computer's configuration. Hibernation works by saving the volatile memory contents to a file on the computer's hard disk drive before shutting down. When power is restored, the file is read back into the memory and device states and services are restored, whereby the computer resumes the state it was in just before the hibernation, i.e., as if power was never turned off. Hibernation is especially popular for powering off mobile computers (e.g., notebooks and laptops) and other computers that are frequently turned on and off, because, unlike shutdown, hibernation automatically preserves opened documents, desktop settings, and other resources that have not been otherwise saved. In addition, a resume from hibernated state is often much faster than a full system startup. Hibernate is also preferable in many instances over a suspend mode that preserves the memory's contents by providing sufficient power to the RAM, because in suspend mode the RAM contents are lost if power is ever interrupted.
While hibernation is thus a valued feature, it is relatively slow because of the large amounts of data (e.g., in contemporary systems the memory is typically at least sixty-four megabytes) that need to be transferred to and from the disk. Entering hibernation takes on the order of twenty to forty seconds, and resuming from hibernation ten to fifteen seconds, depending on the model of the hard disk and the amount of installed memory. Many users are unsatisfied with the amount of time the hibernation process takes, and indeed, virtually every user of this feature would prefer that it were a faster process.
SUMMARY OF THE INVENTION
Briefly, the present invention provides an improved hibernation method and system, including the use of a modified DMA (Direct Memory Access) mode of transferring data to (and from) the disk. The modified DMA mode increases data transfer speed, while also freeing the system processor to perform other tasks, including compressing the data to reduce the size of the data that needs to be transferred to (and later from) the disk. For data that is compressed, the present invention also provides an improved decompression mechanism that speeds resume time. The present invention also provides a special mode that is a combination of a hibernation mode and a suspend mode. This mode essentially maintains power to the RAM while transparently backing the RAM with the hibernation file, such that if power to the RAM is interrupted, the RAM contents are automatically restored from the hibernation file when power is restored.
The use of modified DMA for transferring data to and from disk is significantly faster than programmed input/output (PIO), and also frees the processor to perform other tasks while the hard disk controller transfers the data to and from the disk. To accomplish hibernation with DMA while not changing the contents of the memory that are to be preserved, the present invention uses a special, modified DMA engine (including an asynchronous polling I/O protocol) that works by, among other things, polling a hard disk controller status register instead of relying on interrupts to signal the completed status. Between polling cycles, the compression or decompression tasks are performed by the processor. Moreover, the hibernation engine is able to use system software services (e.g., drivers) required by DMA without changing the contents of the memory to be preserved. To this end, memory devoted to internal driver data structures is allocated in a hibernation-safe memory location or locations, I/O buffer locations are moved into the safe memory range, buffer alignment is controlled, and large I/O requests are split into sequences of smaller ones. More particularly, the hibernation engine communicates with the hard disk controller driver and DMA driver to obtain the hardware and software requirements, e.g., the largest I/O request size that the hardware handles, the amount of memory needed for the driver's internal structures, the region of memory where an I/O buffer should be located, and the required I/O buffer alignment. In this manner, the drivers do not access memory outside of that reserved for hibernation, and data integrity is preserved.
The resume (from hibernation) process has also been sped up by an improved decoding process that eliminates a substantial amount of the bounds checking normally necessary with decompression buffers, yet operates safely within the buffers. To this end, a two-phase LZ77 decoder includes a fast first phase that first guarantees sufficient room for decoding small symbol substrings and literal symbols, thus avoiding the need to check boundary limits before copying the literal symbol or substring as a whole. Because literal symbols and small substrings of symbols form the majority of compressed data, the reduced checking on average significantly speeds up decoding. When the end of the output buffer is neared, a second-phase, standard decoder is used that performs bounds checks on each symbol to ensure that the buffer does not overflow. However, the standard decoder is normally used for decompressing only a small amount of data after a much larger amount of data has been decompressed with the fast decoder. In this manner, speed is greatly improved while not compromising the safety that is needed in a reliable decoder.
Other objects and advantages will become apparent from the following detailed description when taken in conjunction with the drawings, in which:
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram representing a computer system into which the present invention may be incorporated;
FIG. 2 is a block diagram generally representing exemplary components for preserving memory contents while entering hibernation in accordance with an aspect of the present invention;
FIGS. 3, 4A and 4B comprise a flow diagram generally representing steps taken to enter into hibernation in accordance with an aspect of the present invention;
FIG. 5 is a block diagram generally representing exemplary components for restoring memory contents while resuming from hibernation in accordance with an aspect of the present invention;
FIGS. 6 and 7 comprise a flow diagram representing general steps taken to resume from hibernation in accordance with an aspect of the present invention;
FIG. 8 is a flow diagram generally representing a general two-phase decoding procedure for restoring compressed data to an uncompressed state in memory in accordance with an aspect of the present invention;
FIGS. 9–11 comprise a flow diagram representing a fast decompression process in accordance with an aspect of the present invention; and
FIG. 12 is a flow diagram generally representing a procedure for combining a suspend/sleep mode with a hibernation mode in accordance with an aspect of the present invention.
DETAILED DESCRIPTION
Exemplary Operating Environment
FIG. 1 illustrates an example of a suitable computing system environment 100 on which the invention may be implemented. The computing system environment 100 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computing environment 100 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the exemplary operating environment 100.
The invention is operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and so forth, that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
With reference to FIG. 1, an exemplary system for implementing the invention includes a general purpose computing device in the form of a computer 110. Components of the computer 110 may include, but are not limited to, a processing unit 120, a system memory 130, and a system bus 121 that couples various system components including the system memory to the processing unit 120. The system bus 121 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as Mezzanine bus.
The computer 110 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by the computer 110 and includes both volatile and nonvolatile media, and removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer 110. Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.
The system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within computer 110, such as during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120. By way of example, and not limitation, FIG. 1 illustrates operating system 134, application programs 135, other program modules 136 and program data 137.
The computer 110 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, FIG. 1 illustrates a hard disk drive 141 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 151 that reads from or writes to a removable, nonvolatile magnetic disk 152, and an optical disk drive 155 that reads from or writes to a removable, nonvolatile optical disk 156 such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 141 is typically connected to the system bus 121 through a non-removable memory interface such as interface (e.g., hard disk controller) 140, and magnetic disk drive 151 and optical disk drive 155 are typically connected to the system bus 121 by a removable memory interface, such as interface 150.
The drives and their associated computer storage media, discussed above and illustrated in FIG. 1, provide storage of computer-readable instructions, data structures, program modules and other data for the computer 110. In FIG. 1, for example, hard disk drive 141 is illustrated as storing operating system 144, application programs 145, other program modules 146 and program data 147. Note that these components can either be the same as or different from operating system 134, application programs 135, other program modules 136, and program data 137. Operating system 144, application programs 145, other program modules 146, and program data 147 are given different numbers herein to illustrate that, at a minimum, they are different copies. A user may enter commands and information into the computer 110 through input devices such as a keyboard 162 and pointing device 161, commonly referred to as a mouse, trackball or touch pad. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 120 through a user input interface 160 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB). A monitor 191 or other type of display device is also connected to the system bus 121 via an interface, such as a video interface 190. In addition to the monitor, computers may also include other peripheral output devices such as speakers 197 and printer 196, which may be connected through an output peripheral interface 190.
The computer 110 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 180. The remote computer 180 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 110, although only a memory storage device 181 has been illustrated in FIG. 1. The logical connections depicted in FIG. 1 include a local area network (LAN) 171 and a wide area network (WAN) 173, but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet. When used in a LAN networking environment, the computer 110 is connected to the LAN 171 through a network interface or adapter 170. When used in a WAN networking environment, the computer 110 typically includes a modem 172 or other means for establishing communications over the WAN 173, such as the Internet. The modem 172, which may be internal or external, may be connected to the system bus 121 via the user input interface 160 or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 110, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, FIG. 1 illustrates remote application programs 185 as residing on memory device 181. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
Improved Hibernation
The present invention is generally directed to improved hibernation in a number of ways, including by performing faster data transfer. To this end, DMA (direct memory access) may be used when transferring data to and from a hard disk. To use DMA while benefiting from already existing components, the present invention has been implemented using selected drivers in the Windows® 2000 operating system. As will be understood, however, the present invention is not limited to any particular operating system and/or drivers or even a driver model, but rather is capable of being implemented by virtually any mechanism in any type of system.
In general, DMA refers to a fast way of reading or writing data from memory to a peripheral device. With a disk, DMA operates essentially by providing a hard disk controller 140 (FIG. 2) with the various parameters it needs to access the appropriate memory and hard disk 141 locations, and then letting the hard disk controller 140 asynchronously perform the read or write operation. In contrast, programmed input/output (I/O) is an alternative way to transfer data to and from a disk, but is much slower than DMA. In programmed I/O, the system processor 120 is responsible for executing the instructions that transfer the data to and from the disk 141.
Programmed I/O requires significant overhead, and moreover, requires work from the system processor 120, which prevents the system processor 120 from doing other tasks while it performs the programmed I/O. Note that multithreaded operation is not allowed during hibernation, as the operating system components that control threading need to be effectively shut down to safely perform hibernation.
Despite its drawbacks, programmed I/O was heretofore used exclusively for entering and resuming from hibernation. This is because it was not believed to be possible to use DMA with hibernation for a number of reasons that are essentially related to preserving the exact state of the system memory, i.e., the memory contents cannot be changing once hibernation has started (except for special working location or locations allocated to hibernation). More particularly, one reason that it was not believed possible to use DMA during hibernation is that interrupts need to be disabled to prevent any modification to the system memory. However, DMA/the hard disk controller 140 use interrupts to asynchronously notify the system processor of the status (e.g., completed or failed) of requested input-output operations. Another reason why DMA is not implemented with hibernation is that DMA and hard disk controller drivers require various amounts of memory to perform their operations, which in contemporary systems is virtual memory that is provided by mapping the virtual memory address to a physical memory location and swapping the existing physical memory contents to disk to preserve them for another process. However, since memory cannot be changed during hibernation, swapping is also disabled during hibernation, to “freeze” the memory contents. Similarly, existing DMA controllers/drivers work with operating system services that also need to allocate and modify memory, which is forbidden during hibernation, and thus those services are not available for use with hibernation. The hard disk controller 140 also works with drivers that use operating system services and need to allocate and modify memory.
In accordance with one aspect of the present invention, a fast hibernation engine 200 (FIG. 2) provides a controlled manner of using DMA that does not require interrupts and also does not allow memory modification, other than in a special safe section (or sections) of memory 202 allocated for hibernation. This frees the system processor 120 to perform other tasks while the system hardware (DMA controller 204 and hard disk controller 140) perform the requested data transfer operation. In particular, to further reduce the transfer times, the memory contents are compressed upon write to the hibernation file 206, and decompressed upon read, whereby only a reduced amount of data needs to be transferred. For example, writing less data via parallel compression in accordance with the present invention has been shown to significantly reduce the time to enter hibernation, from approximately twenty to forty seconds without the present invention, to about seven to ten seconds with the present invention, (wherein the time ranges are caused by different models of hard disks and the amount of installed memory 132).
To provide DMA in conjunction with preventing the memory from changing via interrupts, interrupts are disabled during hibernation. However, DMA is possible without interrupts in most systems because most contemporary disk hardware (the controller 140) also writes the I/O request status to a status register 208. Although such status registers are typically used only in PIO mode, respective hardware standards allow the use of the status register in the DMA mode.
To use the status register 208 instead of the normal interrupt handling, the present invention implements a polling interface/protocol, whereby the hibernation engine 200 (via the system processor 120) initiates a DMA I/O operation, and then via the drivers periodically checks (polls) the contents of the status register 208 to obtain the status, that is, whether the I/O request is completed. Note that as used herein, the terms “intermittently,” “regularly,” “periodically,” or the like with respect to the polling operation are equivalent, and are only intended to mean that the processor 120 is able to do other work in between asking for the status, rather than implying that any exact (or inexact) polling frequency is required.
In order to implement the polling protocol, the software drivers (DMA controller driver 210 and hard disk controller driver 212) export an additional function that returns the state of the hardware to the hibernation engine 200. The hardware states include ready (i.e., ready for a next I/O operation, which indicates any previous I/O success), busy (currently executing the I/O request), or failed (was unable to execute the I/O request). When transferring data, the I/O status function should be called by the hibernation engine 200 frequently enough so that the disk is not unnecessarily idle, yet not so frequently that it interferes with the processor's ability to complete its parallel data compression or decompression tasks. To this end, a compression engine 214 is called with a special callback function that is invoked sufficiently often to check the status of a pending asynchronous I/O operation and start a new I/O operation when the previous operation is complete. Note, for example, that the frequency of calling for disk status may be tuned for typical systems, and/or may be self-adjusting to adapt to various hardware differences, e.g., call less often while awaiting completion of a write if the disk is usually busy yet compression could use more processing time. The drivers 210, 212 are provided with a function for notifying them when a pending I/O operation is completed, so that a driver may take any appropriate actions required to finalize the I/O, as if an interrupt occurred.
In order to avoid modification of the memory region 216 that is to be written to disk by other means, the hard disk driver 212 and DMA driver 210 are controlled by the hibernation engine 200 to use only the safe hibernation locations of memory 202. The safe hibernation location 202 is specifically dedicated for use with the hibernation engine 200, and comprises discardable pages that are not preserved during hibernation with the other system memory, and therefore may be modified. Note that the safe hibernation location 202 need not be physically contiguous (as generally represented in FIG. 2), but can be allocated in a plurality of locations, including specific physical regions, to accommodate the drivers' or hardware devices' needs. Controlling the memory locations provided to the drivers 210, 212 and complying with other requirements as described below allows existing, slightly-modified drivers to be used, instead of one or more specially-developed drivers (which although feasible, is significantly more work than is needed). Similarly, it is feasible to use two or more processors and/or special hardware, and/or processors that support multiple threads of execution, to further increase the speed of entering (and resuming from) hibernation.
To obtain the requirements of the drivers 210, 212 and thereby control their memory usage, the hard disk driver 212 (e.g., ATAPI.SYS) and the DMA controller driver 210 (e.g., PCIIDEX.SYS) have been extended to include special interface functions that, when called by the hibernation engine 200, will return the requirements and details of the respective driver and hardware it controls. Note that other DMA drivers and hard disk controllers, e.g., SCSI-based, ATA100-based, or others can similarly provide limitations and details that will allow their memory usage to be controlled. One such detail that is returned is whether the hard disk supports DMA and whether DMA is enabled, which if not, will result in the PIO mode being used. Note that if the hard disk driver 212 or the DMA driver 210 do not export this new requirements-related function, or if the function returns with a result that DMA operation is not allowed/supported, the I/O will be performed in the old way, i.e., by the PIO mode with no attempt to do DMA. Similarly, if the kernel fails to allocate I/O buffers satisfying the requirements of the hard disk driver 212 and DMA driver 210 as described below, hibernation will be performed via PIO.
Other information returned by the drivers 210, 212 includes the amount of temporary memory required for storing any driver-specific data structures during operation. With this information, before beginning the data write operations, the hibernation engine 200 can request the operating system kernel to allocate an appropriate amount of memory for the respective drivers in the hibernation-safe area or areas 202. This ensures that the drivers 210, 212 will not need additional memory once the process of writing data for entering hibernation has begun, (because to freeze the non-hibernation memory contents for hibernation, allocation services and the like will not be allowed to operate). To this end, when hibernation is being started but before the data writing starts, the hibernation engine 200 first calls the drivers 210, 212 to find out how much memory they need, then has an appropriate amount of memory allocated therefor at safe hibernation locations 202, and gives each driver a respective pointer to its hibernation-safe memory location. When the drivers 210, 212 receive the respective pointers to the safe hibernation location or locations 202, the drivers 210, 212 initialize their internal data structures at those locations. By allocating sufficient memory based on their specified needs, the device drivers 210, 212 will have their data buffers stored in memory that is already allocated to them, so that no memory mapping/moving will be required when later I/O requests are processed. Note that it is relatively straightforward to extend the functionality of the drivers 210, 212 to comply with hibernation engine 200 requirements, because existing hard disk controller and DMA controller drivers do not use global variables for storing their internal data structures, but rather request and receive a pointer to memory for their own purposes. Consequently, the initialization phase is not different from a regular scenario, except for the initial providing of the drivers' requirements and the memory being allocated in the hibernation-safe memory location or locations 202.
In addition to memory size requirements, the extended driver functions also return the maximum size of allowed I/O requests, because hard disk and DMA controllers have a limit on a size of an I/O request that they may execute as a single atomic operation. Staying under the limit ensures that the drivers 210, 212 will not need kernel services (which will not be available) to handle larger I/O requests. Another detail returned is any maximum I/O buffer address, to handle some older hard disks and DMA controllers that cannot write to or read from memory above 16 MB, e.g., some types of physical memory require special mapping before DMA-based I/O operations in order to make such memory visible for the DMA controller. In addition, because most hard disks and DMA controllers require that an I/O memory block starts at an aligned (even) address, (e.g., 4 KB), alignment requirements of the I/O buffer are obtained. Similarly, because most hard disks and DMA controllers require some even I/O request size, this information is also returned. Still further, because most DMA controllers do not allow I/O memory blocks to cross a 64 KB boundary, any such limitations of the I/O buffer layouts are obtained, whereby the hibernation engine 200 can arrange a compression (output) buffer 218 that meets the drivers' requirements.
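A hypothetical requirements record gives a concrete feel for what such an extended driver function might return; the field names below are illustrative and not taken from the patent. After querying, the engine would allocate internal_mem_bytes in the hibernation-safe region 202 and hand the driver a pointer to it, as described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative shape of the information a hibernation-aware hard disk or
 * DMA driver could report; names and types are assumptions. */
typedef struct {
    int       dma_supported;       /* 0 => fall back to PIO mode            */
    size_t    internal_mem_bytes;  /* scratch memory the driver needs,
                                      allocated in the hibernation-safe
                                      region 202                            */
    size_t    max_io_request;      /* largest single atomic transfer        */
    size_t    min_io_request;      /* e.g., 512-byte sector multiple        */
    size_t    buffer_alignment;    /* required I/O buffer start alignment   */
    uintptr_t max_buffer_address;  /* e.g., 16 MB limit on older hardware   */
    size_t    no_cross_boundary;   /* block may not cross this, e.g., 64 KB */
} hib_io_requirements;

/* Exported by the driver; returns nonzero if the function is supported. */
extern int hib_query_requirements(hib_io_requirements *out);
```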
With the information from the drivers, the hibernation engine 200 can match the requirements and limitations of the hard disk controller 140 and DMA controller 204 so that DMA hibernation is possible without requiring complex kernel services during the I/O. In other words, as long as the drivers' allocated data locations are appropriately sized in a safe memory range, along with a properly aligned output buffer 218, and that the hibernation engine 200 splits any large I/O requests into sequence of smaller ones that otherwise match the hardware limitations, DMA requests will not cause any additional operations with the I/O buffer or I/O request that may violate the memory-related rules of hibernation. Note that instead of using generic memory, the DMA-based I/O functions use the hibernation-safe memory location supplied via the pointer from the calling hibernation engine 200. Since there is no need for the use of complex kernel services for dequeuing pending I/O requests, no dequeuing is performed.
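Splitting a large request into hardware-compliant pieces is straightforward once those limits are known. The following sketch (with assumed constants and an assumed issue_dma_write helper) shows one way to respect both the per-request maximum and a non-crossable boundary; sector rounding and the asynchronous overlap with compression are omitted for brevity.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_IO   (4u * 1024)   /* assumed per-request hardware maximum   */
#define BOUNDARY (64u * 1024)  /* assumed non-crossable DMA address line */

extern int issue_dma_write(uintptr_t buf, size_t len, uint64_t lba);

/* Break one large write into chunks that never exceed MAX_IO and never
 * cross a BOUNDARY-aligned address, so the driver needs no kernel
 * services to re-buffer the request. Assumes buf and len are
 * sector-aligned. */
static int write_split(uintptr_t buf, size_t len, uint64_t lba)
{
    while (len > 0) {
        size_t chunk = (len < MAX_IO) ? len : MAX_IO;
        size_t to_boundary = BOUNDARY - (size_t)(buf % BOUNDARY);
        if (chunk > to_boundary)
            chunk = to_boundary;
        if (issue_dma_write(buf, chunk, lba) != 0)
            return -1;                /* caller retries or drops to PIO */
        buf += chunk;
        lba += chunk / 512;           /* assumed 512-byte sectors       */
        len -= chunk;
    }
    return 0;
}
```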
The hibernation engine starts the I/O operation by calling the DMA driver 210, which in turn instructs the DMA controller 204 to start writing to the hard disk controller 140, and instructs the hard disk controller 140 to write a data sequence from the DMA controller 204 to a specific location on the hard disk 141. The driver works by getting a new I/O (IO_START) request, whose parameters include the size of the requested I/O operation, the target location on the hard disk, and the address of the beginning of the I/O memory block, along with an address in the hibernation-safe memory region 202 for internal driver purposes. For safety reasons, the driver function verifies that each I/O request is compliant with the requirements imposed by the hardware it handles. If compliant, the function passes the request directly to lower-level routines without any transformations. If a request does not comply, the request is rejected and the driver returns STATUS_INVALID_IO_REQUEST.
When the I/O request is completed or an error is detected, any operations on the DMA and hard disk controllers are stopped, and either a STATUS_OK or a STATUS_ERROR code, respectively, is returned by the driver 210. When processing requests, the DMA-hibernation functions avoid using any system services or functions, sending requests directly to the respective hardware. If the I/O operation is not completed instantly and waiting is required, the driver 210 saves its state in its hibernation-safe memory location, and returns a STATUS_PENDING code to signal that the I/O needs to be resumed later.
In other words, because the interrupts are disabled, instead of returning control back to the system when waiting is required, and proceeding further when an acknowledgment interrupt is received, each time the polling function has to wait, it remembers its state (in a hibernation-safe memory location) and the drivers 210, 212 return control to the hibernation engine 200 with a status equal to STATUS_PENDING. A driver assumes that the caller will need to call this polling function again at a later time in order to check on the pending I/O.
More particularly, as described above, instead of using interrupts to detect the end of the next stage of an I/O operation, the driver functions use a respective device status register 208. When the status register 208 has a BUSY bit turned on, i.e., the device is still executing a request, and an ERROR bit turned off, i.e., there are no hardware errors, the driver 210 remembers its state in a hibernation-safe memory location supplied by the hibernation engine 200, and returns STATUS_PENDING. If an ERROR bit is turned on, it means that the requested I/O operation failed, and the driver 210 returns STATUS_ERROR. The interface function should be called with an IO_RESUME instead of IO_START the next time, until the previous I/O operation completes. Otherwise, this stage of I/O completed successfully, whereby the driver 210 returns a STATUS_OK code so that the caller may initiate a new I/O.
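The driver-side protocol can be pictured with a small sketch. The status bits shown follow the ATA convention (BSY in bit 7, ERR in bit 0) purely for illustration; the function and register-access names are hypothetical, and the saving of driver state between calls is not shown.

```c
#include <stdint.h>

#define ST_BUSY  0x80u   /* ATA-style BSY bit, illustrative */
#define ST_ERROR 0x01u   /* ATA-style ERR bit, illustrative */

typedef enum { STATUS_OK, STATUS_PENDING, STATUS_ERROR } io_status;
typedef enum { IO_START, IO_RESUME } io_op;

extern uint8_t read_status_register(void);  /* port/MMIO read, assumed */
extern void    program_transfer(void);      /* sets up the DMA engine  */

/* Interrupt-free completion check: read the controller status register
 * instead of waiting for an interrupt. */
io_status dma_io(io_op op)
{
    if (op == IO_START)
        program_transfer();       /* IO_RESUME: just re-check status   */

    uint8_t st = read_status_register();
    if (st & ST_ERROR)
        return STATUS_ERROR;      /* caller may retry or switch to PIO */
    if (st & ST_BUSY)
        return STATUS_PENDING;    /* caller polls again with IO_RESUME */
    return STATUS_OK;             /* ready for the next request        */
}
```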
Turning to an explanation of the invention with respect to entering into hibernation, the flow diagrams of FIGS. 3, 4A and 4B generally represent the steps taken by the hibernation engine 200. Beginning at step 300, the hibernation engine 200 tests whether the hibernation-related functions are present in the hard disk controller driver and DMA driver. If not, step 300 branches to step 314 where the known PIO mode of entering hibernation is attempted. If at step 300 the hibernate functions are present, step 300 branches to step 302, wherein the hibernation engine 200 calls the drivers' respective functions to obtain the requirements and limitations of the drivers and their respective hardware devices as discussed above.
Step 304 then evaluates the returned information to test whether DMA is supported by the particular hardware. If not, step 304 branches to step 314 where the known PIO mode of entering hibernation is instead attempted. Alternatively, if DMA is supported, step 304 branches to step 306 wherein the hibernation engine 200 calls the kernel services (note that they are still active at this time) to allocate memory for the drivers in the safe hibernation location in accordance with their requirements. Step 308 represents the requested allocation of memory for the compression buffer 218, including any alignment or special physical locations as specified by the drivers.
Step 310 represents a test of whether the allocation requests were successful. Note that in actuality, each separate allocation request has a returned status that may indicate an error, however for simplicity these are collectively shown by decision diamond 310. If there is an error (e.g., insufficient memory), step 310 branches to step 314 where the known PIO mode of entering hibernation is instead performed.
If the memory was successfully allocated, step 312 represents the passing of the pointers to the safe memory locations by the hibernation engine 200 to the drivers. Note that this can alternatively be provided later, such as in conjunction with the write request. Step 316 represents the protecting of the memory regions that are to be preserved, by disabling interrupts and shutting down the system services. System services can be effectively shut down by gaining exclusive control of the processor or processors and thereafter not relinquishing control until after hibernation resume operations (described below) are complete. The process continues to step 400 of FIG. 4A to begin compressing and writing the compressed data to the hard disk 141.
Step 400 represents the initializing of the buffer 218 in which compressed data will be placed prior to writing it, and step 402 tests whether the buffer is full, as discussed below, which initially is not the case. To compress the data, the compression engine 214 is called at step 404, with parameters providing the compression engine 214 with a pointer to the output buffer 218 and the starting location (e.g., selected page) of the system memory 216 to be preserved. Note that compression of the data is not necessary for hibernation, however compression significantly reduces data transfer times.
It should be noted that not all of the remaining (non-hibernation engine) memory necessarily needs to be compressed/saved during hibernation, only those pages that are not marked as invalid or discardable. For example, in contemporary operating systems, memory pages may be deallocated, copied to disk for swapping purposes, and so forth, and thus there is no reason to preserve such unneeded or already-preserved pages. Rather than compress and preserve such pages, the hibernation engine starts with a list of non-discardable physical (e.g., 4 KB) memory pages that should be saved, and (instead of advancing an input pointer that points to memory) advances in the list of “pages-to-write” until the end of the list is reached, as sketched below. Similarly, only those pages need to be restored. Thus, the system memory regions to preserve 216 may be considered to comprise only such selected pages.
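The page-list walk might look like the following sketch; the list layout and helper names are assumptions, and the compression and I/O details are deferred to the surrounding discussion.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed physical page size */

typedef struct {
    uintptr_t *pfn;      /* physical page numbers of pages to preserve */
    size_t     count;
} pages_to_write;

extern void *map_page(uintptr_t pfn);                 /* assumed mapping  */
extern void  compress_page(const void *p, size_t n);  /* feeds engine 214 */

/* Walk the prebuilt list of non-discardable pages rather than advancing a
 * raw pointer through all of physical memory. */
static void save_pages(const pages_to_write *list)
{
    for (size_t i = 0; i < list->count; i++)
        compress_page(map_page(list->pfn[i]), PAGE_SIZE);
}
```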
To perform the data compression, any data compression engine is suitable, such as an LZ77-based compression engine that provides an acceptable compression ratio and compression speed. LZ77 compression is a well-known data compression technique which generally operates by looking for matching bit or byte patterns in previous locations within a size-limited window, (e.g., 64 KB), and then replacing the matching pattern with a length, offset pairing that identifies the pattern. If no sufficient matching pattern is found, the pattern itself (one or more literals) is stored in the output compression buffer 218. In general, the larger the window the better the compression ratio, but the longer it takes to search for matching symbol strings.
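As a concrete illustration (not an example from the patent itself): the string “abcabcabcx” may be encoded as the literals a, b and c, followed by the pair (length 6, offset 3), meaning copy six symbols starting three positions back, followed by the literal x. Note that the length may legitimately exceed the offset; the decoder copies forward symbol by symbol, so the three-symbol pattern replicates itself.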
The compression engine 214 can be provided with data to compress and polled for completion; however, it is also straightforward to arrange a compression engine 214 with a callback mechanism to call a function of the hibernation engine 200 after some number of bytes are compressed, where the number may be specified by a parameter provided by the calling hibernation engine 200. In this manner, when compressing data, the pending I/O status can be regularly checked at a frequency controlled (primarily) by the hibernation engine 200. Step 404 represents the call by the hibernation engine 200 to the compression engine 214, which compresses the data into the output compression buffer 218. With the call, the compression engine 214 is passed a pointer to call back the hibernation engine 200, so that when that amount of data has been compressed, the drivers can again be polled until the last pending I/O completes.
Following the call back from the compression engine 214, step 406 represents the checking of the disk status, which initially should be ready. If ready, step 410 is executed to notify the drivers 210, 212 of any completed I/O, as needed. Step 412 checks whether there is enough compressed data to perform the write, e.g., based on hardware requirements and/or otherwise for efficiency, such as to write larger blocks per request rather than several smaller ones. If there is not enough data, the process loops back to step 404 to call back the compression engine 214 to compress more data. Note that with appropriate buffer sizes, it is not possible to have a situation wherein the buffer 218 is full yet there is not enough data to write.
When there is enough data to perform a write as determined via step 412, step 414 then tests whether the data ready in the output buffer 218 exceeds the amount that the hardware can handle, which as discussed above was provided to the hibernation engine 200 via the calls to the drivers. If so, step 414 branches to step 416 to set the write request to the hardware limit, and thereby essentially segment this request so that the hardware drivers do not need the complex kernel services (now unavailable) to handle the request. Step 418 represents the DMA write request to the DMA controller driver 210. Note that a circular output buffer that is properly aligned and has a size that is a multiple of the minimum I/O request size may be used (e.g., the maximum I/O request size is typically 4 KB, with a non-crossable boundary, typically equal to 64 KB). Disk writes are in sizes that are a multiple of the minimum I/O request size (typically 512 bytes), but do not exceed the maximum I/O request size, and the I/O request address range does not cross a boundary. This takes advantage of the fact that the maximum I/O request size (equal to 4 KB for conventional CPUs) is a multiple of the minimum I/O request size (equal to the 512-byte sector size on most hard disks), which, in turn, is a multiple of the buffer address alignment (two bytes for the PIIX DMA controller), which is a multiple of the request size alignment (two bytes for PIIX DMA and IDE hard disks). The circular buffer comprises three possibly empty parts that are circularly contiguous, namely compressed data ready to write, unused, and compressed data being written.
In keeping with one aspect of the present invention, while waiting for the write to complete, if there is more data to compress, step 420 loops back to step 402 to enable parallel data compression, provided that there is room in the output buffer 218 for the compressed data at step 402. Note that because the hibernation engine 200 controls the output buffer location into which the compression engine 214 will compress the source data, and the amount of data to compress, the hibernation engine 200 ensures that any already-compressed data will not be overwritten in the compression output buffer 218 until it has been successfully written to disk. The hibernation engine 200 may thus wait as necessary for the next write to complete to free up enough space in the buffer 218. The hibernation engine 200 can assume a worst-case data compression scenario (e.g., that actually expands rather than compresses the data) when evaluating whether there is room in the output buffer 218.
Once a DMA write is started via step 418, step 406 represents the subsequent polling calls by the hibernation engine 200 to have the driver check the status register 208, as described above. While the write request is pending, step 406 loops back to step 402 to enable further data compression to occur in parallel with the write, until the status changes to ready or failed. If the DMA I/O write operation fails (the driver returned STATUS_ERROR), at step 408 the hibernation engine 200 may retry the request some number of times (e.g., once more), and if it still fails will try to switch to PIO mode and continue operation in PIO mode. If the PIO-mode I/O also fails, this is considered an unrecoverable error, whereby the system will not enter hibernation, but instead will shut down and next time attempt to return in a full power-on mode. If at step 406 the write was successful, i.e., the status is ready, the hibernation engine 200 notifies the drivers 210, 212 at step 410, and proceeds via steps 414–420 to repeat the process and start the next write.
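The overall shape of this loop, compressing on the CPU while a DMA write is in flight, can be sketched as follows. For clarity the circular buffer of FIG. 4A is replaced with a simple double buffer, and segmentation to the hardware maximum, retry, and the PIO fallback of step 408 are omitted; all helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { STATUS_OK, STATUS_PENDING, STATUS_ERROR } io_status;

extern io_status dma_poll(void);                          /* status register */
extern int       start_dma_write(const void *p, size_t n);
extern size_t    compress_more(void *dst, size_t room);   /* bytes produced  */
extern int       more_input(void);                        /* pages left?     */

/* Fill one half-buffer with compressed data while the other half is being
 * written by DMA; poll the disk status between compression slices. */
int hibernate_write_loop(uint8_t *half[2], size_t halfsz)
{
    int    fill = 0;           /* half currently being filled   */
    size_t ready = 0;          /* compressed bytes in that half */
    int    write_pending = 0;

    for (;;) {
        if (write_pending) {
            io_status s = dma_poll();
            if (s == STATUS_ERROR)
                return -1;                 /* retry/PIO fallback not shown */
            if (s == STATUS_OK)
                write_pending = 0;         /* notify drivers, reuse half   */
        }
        if (!write_pending && ready == halfsz) {
            if (start_dma_write(half[fill], halfsz) != 0)
                return -1;
            write_pending = 1;
            fill ^= 1;                     /* compress into the other half */
            ready = 0;
        }
        if (more_input() && ready < halfsz) {
            ready += compress_more(half[fill] + ready, halfsz - ready);
        } else if (!more_input() && !write_pending) {
            /* flush the final, possibly partial, half and finish
             * (final poll-to-completion omitted) */
            if (ready && start_dma_write(half[fill], ready) != 0)
                return -1;
            return 0;
        }
    }
}
```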
Eventually, there will be no more data in the memory locations 216 to compress at step 420, and the process will continue to step 430 of FIG. 4B, which represents polling the status for the previous write request to complete, again handling any failure via step 432. If the disk status is ready, step 434 calls the drivers 210, 212 to notify them of the success. Step 436 then tests whether there is more data in the compression output buffer 218 to write. If not, the memory contents have been saved to the hibernation file 206 and the process continues to step 448, described below. If there is more data in the output buffer 218 to write, step 438 tests whether there is enough to write, and if not, branches to step 440 to fill an otherwise inadequate write request with blank data or the like if needed to meet a hardware minimum.
If there is enough data to write, step 438 branches to step 442 to test the amount of data that can be written against the hardware limit. If too large, step 442 branches to step 444 to set the write request to the hardware limit, and thereby essentially segment this request. Step 446 represents the DMA write request to the DMA controller driver 210. Following the write request, the process returns to step 430 to await the write status as described above.
When all data has been compressed (step 420) and there is no more data in the output buffer to write (step 436), step 448 is executed. Step 448 represents the marking of the hibernation file 206 as valid (tested upon resume, as described below), the closing of the hibernation file 206, and the outputting of the instructions (e.g., to various registers) that put the machine hardware into the proper mode so as to awake when one or more various criteria are satisfied. In the above manner, hibernation can be rapidly achieved via parallel data compression and DMA disk writing.
Resuming from hibernation, such as when the user presses a power button, can proceed as before, e.g., via PIO. However, the present invention also provides two significant improvements to the resume from hibernation operation, which can be combined or used separately to resume much faster than with conventional PIO and decompression models. A first improvement, described below, deals with the use of DMA to read the hibernation file 206 and thereby allow parallel decompression to occur. A second improvement, also described below, provides a significantly faster decompression process than those previously used.
FIG. 5 and the flow diagrams of FIGS. 6 and 7 represent the resume components and process when performed with DMA. To implement DMA resume, a system loader process 500 has been modified to check the hibernation file 206, and if marked valid, read in (e.g., via PIO) the components needed to perform DMA, e.g., the hibernation engine 200 (or a resume equivalent) along with the DMA controller driver 210 and hard disk driver 212. Note that PIO reading (no DMA) is presently (and exclusively) performed during resume from hibernation, and thus will not be described in detail herein.
In general, upon startup, the initial system loader 500 first checks the status of the hibernation file 206 as represented in FIG. 6 via step 600. If not valid, via step 602 the system will boot normally, e.g., without attempting to restore any hibernated state, since this normally means that the user did not enter into hibernation when the machine was powered down. Another possibility is that the hibernation file 206 is marked as being in a “Resuming” state. As described below, this means that the hibernation resume was attempted, but failed for some reason. This is accomplished when step 600 determines that the hibernation file 206 is valid, and step 604 marks the file as “Resuming,” (e.g., via a PIO write), whereby if the resume process fails during memory restoration the user will be prompted to retry the resume or direct the machine to boot normally. This gives the user control rather than forever attempting to resume from a failed hibernation. Note that later, if the resume operation was successful, the hibernation file 206 will be marked as invalid.
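The tri-state marking can be summarized with a small sketch; the state names and helpers below are illustrative only.

```c
/* Illustrative tri-state marker for the hibernation image. */
typedef enum {
    HIB_INVALID,   /* no usable image: boot normally                  */
    HIB_VALID,     /* complete image: mark HIB_RESUMING, then restore */
    HIB_RESUMING   /* a prior resume attempt failed: ask the user     */
} hib_file_state;

extern hib_file_state read_hib_state(void);   /* assumed header accessors */
extern void           write_hib_state(hib_file_state s);

void loader_boot_decision(void)
{
    switch (read_hib_state()) {
    case HIB_VALID:
        write_hib_state(HIB_RESUMING);  /* e.g., via a small PIO write  */
        /* load hibernation engine and DMA drivers, restore memory ... */
        break;
    case HIB_RESUMING:
        /* prompt: retry the resume, or boot normally */
        break;
    default:
        /* normal boot */
        break;
    }
}
```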
Step 606 represents the reading in of the hibernation engine 200, a decompression engine 502, and the drivers that enable DMA. The system loader can read this into a hibernation resume safe memory region 504, or once loaded the hibernation engine 200 can move itself and the other components around as needed. Once loading is complete, the system loader 500 then calls the hibernation engine 200 as represented by step 608. Note that the system loader 500 (or the hibernation engine 200) can load the decompression engine 502 based on the type of compression that was used, if known in advance or is otherwise determinable, (e.g., via metadata associated with the hibernation file 206). For example, as compression technology has evolved, different decompression engines are needed, and the present invention can selectively load a matching one so that new technologies can be used, yet backward compatibility is retained.
At step 610, the hibernation engine 200 initializes one or more decompression input buffers 506 to prepare for reading and decompressing of the hibernation file 206. The hibernation engine 200 then proceeds to step 700 of FIG. 7 to call the DMA controller driver 210 to read in the data for decompression and restoration thereof via DMA read request, passing a pointer to the input buffer 506. The hibernation engine 200 uses a logically circular input buffer 506 or the like that is sufficiently large with respect to the original compression window so that a read can occur into one location while another location is being decompressed without the possibility that data (a pattern) that is needed for decompression will be overwritten by the next read. In this manner, the buffer can be safely filled in one buffer location while the data in another buffer location is being decompressed. Further, note that the read size may be adjusted for segmenting if a logically circular buffer is being used so as to not read beyond the physical end of the buffer.
Step 702 checks whether the buffer 506 is ready with a sufficient amount of data to start decompression because, for example, there may not yet be any data in it, or the reads may have to be segmented to match hardware limitations, and it may be more efficient to decompress data in large amounts per call rather than to call the decompression engine regularly with only small amounts. If more data is needed, step 702 bypasses calling the decompression engine; otherwise step 704 is executed to call the decompression engine 502 to decompress some amount of data. The decompression engine 502 may be similar to the compression engine described above with respect to its called functions, e.g., it is passed a pointer to the appropriate location in the input buffer 506, an amount (e.g., a number of compressed bytes) to decompress, and a callback pointer for calling the hibernation engine 200 when finished. The decompression engine 502 is also given a pointer to an output buffer, which may be a temporary restore buffer or a location directly in the system memory 216 (e.g., in a selected page) that is being restored.
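The call interface described above might take the following shape in C; the function name, parameter layout, and callback type are assumptions for illustration only, not the actual engine's API.

```c
#include <stddef.h>

/* Callback into the hibernation engine 200 when a block is finished. */
typedef void (*decomp_done_cb)(void *context);

/* Hypothetical shape of the decompression call: a source location in the
 * input buffer 506, a compressed byte count, an output destination (either
 * a temporary restore buffer or a page of system memory 216), and a
 * completion callback. */
int decompress_block(const unsigned char *src, size_t compressed_bytes,
                     unsigned char *dst, size_t dst_capacity,
                     decomp_done_cb on_done, void *context);
```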
Step 706 represents the polling for the DMA status, (similar to that described above with respect to the write request), with step 708 representing the handling of the failure case. If a failure occurred, one or more DMA or PIO retries may be made, which if also unsuccessful will cause a restart of the system, this time with the hibernation file 206 marked as “Resuming” status. If the status is pending (busy), the process branches back to step 702 to attempt to decompress more data. If ready, the process continues to step 710 to notify the drivers.
Step 712 tests for whether the hibernation file 206 has been fully read into the input buffer 506. If not, step 712 branches to step 714 to determine whether there is adequate room in the input buffer 506 for further data to be read. If there is not enough room, step 714 branches to step 704 to decompress some more data. If there is room, step 714 returns to step 700 to read in another amount. Note that in addition to detecting the end of the file 206, the system loader/hibernation engine 200 can also detect out of memory situations, such as when memory was physically removed since hibernation was entered or the hibernation file 206 “appears” on a different machine (because the hard drive was transferred) without sufficient memory. For example, in situations when it turns out that there is more data to decompress than memory for outputting it, the hibernation file 206 can be marked invalid and the system restarted.
If at step 712 the hibernation file 206 has been completely read in, steps 716, 718 and 720 are executed to complete the resume-from-hibernation process, including decompressing the remaining data in the input buffer 506, marking the hibernation file 206 as invalid (so that it will not be used again unless hibernation is explicitly re-entered), and exiting hibernation. Exiting hibernation essentially comprises deallocating as much of the hibernation resume safe memory 504 as possible and relinquishing control from the hibernation engine 200 to the processes that were executing when hibernation was entered. At this time, the system memory 216 (e.g., each of the selected pages) is restored. The state of devices, services and so forth is also restored, and the computer operates as it did before hibernation was entered.
In accordance with another aspect of the present invention, a significantly faster LZ77 decompression decoder is provided that can be used with data decompression generally, including when resuming from hibernation (whether PIO resume or DMA-based resume). In general, the faster decoder operates by reducing the number of bounds checks on the buffers without allowing the output buffer to overflow.
It was previously believed that to decompress LZ77-encoded data, regular bounds checks are necessary to deal with data that may have been corrupted. More particularly, a reliable decompression decoder needs to handle not only valid, correctly encoded data, but also corrupted data. If encoded data is corrupted, a reliable decoder either detects and reports an error or, if an error cannot be detected easily, continues execution without crashing or causing system instability. Note that checksums and the like detect data corruption only probabilistically, and thus are not guaranteed to be perfect; there is always a chance that a checksum will miss data corruption.
To be reliable, an LZ77 decoder has to ensure that the input buffer has enough data, that each substring referenced by a pointer in LZ77 code actually exists, and that the output buffer has enough space to hold decoded symbols. If any of these rules are not enforced, corrupted data may cause the decoder to access memory outside of its input/output buffers, causing a memory protection fault or overwriting other key data, thus crashing or destabilizing the system. Much of the processor time used by known decoders is spent checking the bounds to ensure that decoding operations stay within the buffers.
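To make that cost concrete, the following C sketch shows a conventional, fully checked decode loop; the token format (a marker byte followed by either a literal or a length/offset record) is a simplifying assumption for illustration, not the patent's actual encoding.

```c
#include <stddef.h>

/* A conventional, fully bounds-checked LZ77 decode loop (toy token format).
 * Every iteration pays the three checks named in the text. */
static int lz77_decode_checked(const unsigned char *in, size_t in_len,
                               unsigned char *out, size_t out_cap,
                               size_t *out_len)
{
    size_t ip = 0, op = 0;
    while (ip < in_len) {                      /* check 1: input has data   */
        if (in[ip] == 0) {                     /* assumed literal marker    */
            if (ip + 1 >= in_len) return -1;
            if (op >= out_cap) return -1;      /* check 3: output space     */
            out[op++] = in[ip + 1];
            ip += 2;
        } else {                               /* assumed match record      */
            if (ip + 3 >= in_len) return -1;
            size_t len = in[ip + 1];
            size_t off = (size_t)in[ip + 2] | ((size_t)in[ip + 3] << 8);
            if (off == 0 || off > op) return -1; /* check 2: match exists   */
            if (len > out_cap - op) return -1;   /* check 3: output space   */
            while (len--) {                      /* symbol-by-symbol copy   */
                out[op] = out[op - off];
                op++;
            }
            ip += 4;
        }
    }
    *out_len = op;
    return 0;
}
```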
FIGS. 8–11 are flow diagrams that explain how the fast decoder (e.g., 508) of the present invention substantially increases the decompression speed by reducing the number of bounds checks for the majority of the data to be decompressed, yet does so safely. The remaining data that cannot be safely decompressed with the fast decoder 508 is decompressed with a standard decoder 510. Note that both the standard decoder 510 and fast decoder 508 are relatively small, and thus it is not a significant burden on a contemporary system to have two types of decoders loaded into memory at the same time. Since the standard decoder 510 is used when the fast decoder 508 approaches the end of the buffer, the fast and standard decoders can be considered a two-phase decoder, and can be incorporated into the decompression engine 502 of FIG. 5.
The first phase (fast) decoder 508 benefits from common statistical properties of typical encoded data (texts, documents, executable programs, object files, libraries, HTML/XML pages, database files, and so forth) in order to considerably reduce the number of bounds checks needed on average. One feature that the present invention leverages is that in typical data, most matching substrings are relatively short. Indeed, in typical encoded data, approximately ninety-five percent of matching substrings do not exceed eight symbols in length, while the average length of a matching substring is approximately five symbols.
In general, the fast decoder 508 of the present invention more efficiently handles the decoding of these short substrings and literal symbols by eliminating the output buffer bounds checks for them, and by replacing the conventional symbol-by-symbol copying (which checks for the end of the substring after each copied symbol) with a technique that unconditionally copies a fixed number of symbols. To do this safely, the fast decoder 508 guarantees that the output buffer will have enough space left to hold short substrings and literal symbols. When the buffer is almost full and this guarantee is no longer possible, the decompression process switches to the second decoding phase, which finishes the decoding using the standard decoder 510.
During the first phase of decompression, the fast decoder 508 guarantees that it has enough space in the output buffer for some number N of symbols, chosen herein to be space for eight (8) more symbols. The choice of the actual value may vary depending on a particular implementation and hardware. FIG. 8 shows a general process that sets up and then calls the fast decoder 508 in a first phase, and the standard decoder 510 in a second phase, to decode some amount of data. Step 800 represents the initialization of the input and output buffers. A safe bound, the maximum safe input position, is calculated at step 802 as (maximum output position + 7)/8, where these values correspond to the N=8 value chosen above. Step 804 then calls the fast decoder, beginning at step 900 of FIG. 9.
The guarantee is accomplished by periodically adjusting an input buffer bound limit. To do this, the input buffer bound is set to the minimum of the input buffer size and the output buffer size divided by N, where in the present example N=8. When the decompression engine reaches this limit, or needs to decode a substring of more than N (=8) symbols, it readjusts the limit, increasing it by the minimum of the remaining part of the input buffer and the remaining part of the output buffer divided by N. If the input buffer is fully exhausted or the output buffer does not have space for at least N (=8) more symbols, the first phase finishes. If the result was OK (e.g., the data was not corrupted, as described below), the remaining (N−1) or fewer symbols are decoded by the second phase, the standard decoder 510. This is represented in FIG. 8 by steps 806–810.
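The limit arithmetic can be sketched as follows; for simplicity the sketch counts one input unit per token, which is an assumption about the encoding rather than the actual token sizes.

```c
#include <stddef.h>

#define N 8   /* guaranteed free output symbols during the first phase */

/* Recompute the maximum safe input position: advance by the remaining
 * input or the remaining output (in units of N-symbol tokens), whichever
 * is smaller.  A result equal to in_pos means the first phase must end. */
static size_t readjust_input_limit(size_t in_pos, size_t in_size,
                                   size_t out_pos, size_t out_size)
{
    size_t in_left = in_size - in_pos;
    size_t out_left = (out_size - out_pos) / N;  /* each token emits <= N */
    size_t step = (in_left < out_left) ? in_left : out_left;
    return in_pos + step;
}
```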
FIGS. 9–11 show the process of the fast decoder 508, beginning at step 900, wherein a safety check is made that the actual input position is not greater than or equal to the maximum safe value (first calculated at step 802). At this time, the input position is zero, so step 900 branches to step 1000 of FIG. 10.
Step 1000 of FIG. 10 selects as a token the next element (a literal symbol or a length, offset pairing) from the input buffer (e.g., 506), and then at step 1002 moves the input position based on the size of the token. Step 1006 checks whether the token is a literal symbol. If so, the token is copied to the output buffer at step 1008, without any bounds check. Note that during the first phase of decoding, it is guaranteed that the output buffer always has enough space to store at least N more symbols. This is done without increasing the number of comparisons, because before decoding, the next-token decoding operation ensures that the input buffer limit is not crossed. As a result, the first phase decoder 508 need not check the output buffer bound each time a literal symbol is decoded, because it knows that the output buffer has enough space to store one more symbol (actually N more symbols), as guaranteed by the fact that the input position is less than the maximum safe value. Step 1010 adjusts the position in the output buffer to prepare for the next symbol.
If at step 1006 the token was not a literal, step 1006 branches to step 1012 to obtain the length and offset values from the token. Step 1014 is a check that ensures that the offset is not greater than the output position; note that in LZ77 compression, any given offset should point to a previous location in the already decoded data in the output buffer. If it does not, the compressed data is corrupted, and step 1014 branches to step 1016 to report this and return to FIG. 8 to halt the decompression.
Normally the compressed data is not corrupt, and step 1018 is executed to check whether the length value exceeds the N=8 value that was previously selected. If so, the process continues to step 1100 of FIG. 11 as described below. Most of the time, however (e.g., approximately ninety-five percent of the time for typical data), the string length is eight symbols or less, and step 1018 branches to step 1020, wherein eight symbols are copied starting from the offset. The output position pointer is then moved by the actual length.
In order to reduce the number of comparisons needed during the copying of a substring, the first phase decompression engine processes substrings of N or less symbols and longer substrings differently. If the length of a substring is N or less, then the decompression engine copies N (e.g., eight) symbols into the output buffer (which is guaranteed to have enough space to hold N more symbols), and then adjusts the output position (pointer) by the actual length (e.g., five) of the substring. Note that since N is small, copying of the few extra symbols (which will then be overwritten) is normally faster than copying the substring symbol by symbol in a loop while comparing the number of symbols copied so far against the actual amount needed.
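A sketch of this unconditional short copy (with N=8) follows; the forward byte-by-byte copy is an assumed implementation detail chosen so that overlapping matches (offset less than length) still replicate correctly.

```c
#include <stddef.h>

/* Copy a match of length 1..8: always move 8 symbols (no per-symbol end
 * test), then advance the output position by the actual length.  Safe only
 * because the first phase guarantees at least 8 free output bytes; the
 * extra symbols are simply overwritten by later output. */
static void copy_short_match(unsigned char *out, size_t *op,
                             size_t offset, size_t length)
{
    unsigned char *dst = out + *op;
    const unsigned char *src = dst - offset;  /* offset <= *op (step 1014) */
    for (size_t i = 0; i < 8; i++)  /* fixed count: no length comparison   */
        dst[i] = src[i];            /* forward copy handles overlap        */
    *op += length;                  /* advance by the real length only     */
}
```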
When a literal token is copied, or a substring of length N (=8) or less is copied, step 1024 is executed to determine whether there is another token to decode. If not, the fast decoding is done; step 1026 sets the decode status to "OK" or the like and returns to FIG. 8. Otherwise, the process returns to FIG. 9, step 900. Step 900 is again evaluated to determine whether the (now non-zero) input position has reached the maximum safe position. Since the input position keeps increasing, eventually it will reach the maximum safe position at step 900, at which time step 900 will branch to step 902 to calculate a value representing the number of tokens that can be safely decoded as literals and small (length<=8) substrings. If at step 904 this number is zero, then the output position is too close to the end of the output buffer to use the first phase (fast) decoder 508; the decode status is set to OK and the process returns to FIG. 8 to decode any remaining data via the standard decoder 510.
When there is still a safe amount of room in the output buffer at step 904, step 908 sets a new maximum safe input position equal to the actual input position plus the number of safe tokens. If at step 910 this new maximum safe input position does not exceed the maximum value allowed for it, the process branches to FIG. 10 to decode another literal or substring as described above. If, however, it does exceed the maximum allowed value, step 912 clamps it to that maximum value. Step 914 then checks whether the actual input position exceeds this maximum value; if so, the data is corrupt (step 916), and if not, the process branches to FIG. 10 to decode another literal or substring as described above.
FIG. 11 represents the steps generally taken when a substring is greater than eight symbols in length, beginning at step 1100 wherein a last output position variable is set to the output position plus the length. Step 1102 checks whether the last output position is greater than the maximum output position allowed. If so, the data is corrupt, and step 1102 branches to step 1110 to set the status to corrupt and return to FIG. 8. Normally this is not the case, and step 1102 branches to step 1104.
At step 1104, the decompression engine compares the length with a MAXLENGTH escape code value, which, in general, is a special code indicating that the length is greater than a more simply represented length value and that the actual length has to be decoded from other bits. If the length is at least MAXLENGTH, step 1106 is executed. In step 1106, the decompression engine efficiently handles a sufficiently large substring by copying MAXLENGTH symbols to the output buffer at a time, until the number of uncopied symbols remaining is less than MAXLENGTH. Each time the decompression engine copies MAXLENGTH symbols to the output, the output position counter is incremented by the MAXLENGTH value and a counter of symbols left to copy is decremented by MAXLENGTH. Any remaining (MAXLENGTH−1) or fewer symbols are copied one by one via steps 1112, 1114, 1116 and 1118, wherein the last output position value represents where the output position should be when the last symbol is decoded. With this approach, the decompression engine guarantees that copying a substring of Length symbols requires no more than (Length/MAXLENGTH + MAXLENGTH − 1) comparisons, effectively reducing the number of comparisons by almost a factor of MAXLENGTH for a sufficiently large Length value. Steps 1108 and 1110 handle the situation of not enough data in the input buffer for a given length, which indicates corrupted data. When finished decoding the long substring, the process returns to FIG. 9 as described above.
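The block-copy loop for long substrings might be sketched as follows; MAXLENGTH = 16 is an assumed value for illustration, and the byte-wise inner copy again handles overlapping matches.

```c
#include <stddef.h>

#define MAXLENGTH 16   /* assumed block size for illustration */

/* Copy a long match in MAXLENGTH-symbol blocks while at least MAXLENGTH
 * symbols remain, then finish the short tail symbol by symbol, bounding
 * the loop-control comparisons at roughly length/MAXLENGTH + MAXLENGTH - 1
 * (the fixed-count inner loop can be unrolled by the compiler). */
static void copy_long_match(unsigned char *out, size_t *op,
                            size_t offset, size_t length)
{
    size_t pos = *op;                      /* offset <= pos (step 1014)   */
    while (length >= MAXLENGTH) {          /* one comparison per block    */
        for (size_t i = 0; i < MAXLENGTH; i++)
            out[pos + i] = out[pos + i - offset];
        pos += MAXLENGTH;
        length -= MAXLENGTH;
    }
    while (length--) {                     /* tail of < MAXLENGTH symbols */
        out[pos] = out[pos - offset];
        pos++;
    }
    *op = pos;
}
```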
In this manner, by guaranteeing that the output buffer always has enough space to store at least N more symbols, and then copying one (literal) or N symbols for small lengths (less than or equal to N) without bounds checking or symbol counting/comparing, the total number of comparisons can be substantially reduced. In one implementation, the fast two-phase LZ77 decoder of the present invention improved actual decoding speeds relative to a conventional decoder by about twenty to forty percent (depending on the actual compressed data and hardware used), primarily by reducing the number of comparisons (by about a factor of two).
While the present invention significantly reduces hibernation entering and resuming times, hibernation still takes a noticeable amount of time. In accordance with another aspect of the present invention, one way in which the perceived time may be shortened, without losing the power-safe benefits of hibernation, is to combine a suspend mode with a background hibernation process. FIG. 12 shows one way in which this can be accomplished, beginning at step 1200 when a user "shuts down" a machine with an option or the like that puts the machine into a suspend or sleep mode (a low-power, memory-preserving state, sometimes referred to as an "S3" state). This option may be user-selectable (e.g., "suspend with hibernation" versus normal "suspend") or automatic, e.g., suspend is backed by hibernation without the user being explicitly aware of it.
A suspend/sleep state may be generally accomplished by providing sufficient power to the volatile memory to maintain its contents and shutting down the rest of the hardware devices. Hibernation that backs the suspend state may be further accomplished by removing power to the devices not needed for hibernation (step 1202) while temporarily maintaining power to the hard disk (controller and drive), and preferably the DMA controller. Note that this gives the appearance of instantaneously entering suspend, e.g., the display goes dark, speakers go silent and so forth, (although a perceptive user may notice the hard drive operating for a while longer). The remaining powered-up hardware is then used (e.g., via the DMA process as described above or via a PIO process) to automatically write the memory contents to a valid hibernation file at step 1204. Once the hibernation file is written and marked valid, the hard disk hardware and DMA controller are powered off, as represented in FIG. 12 via step 1206. The system is now in a suspended mode as previously understood, but with a valid hibernation file backing the memory contents as if hibernating.
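The entry sequence can be sketched as follows; the function names are placeholder stubs standing in for platform power-management hooks, not a real API.

```c
#include <stdio.h>

/* Placeholder stubs for platform power-management hooks (assumptions). */
static void power_down_devices_except_disk_and_dma(void) { puts("devices off"); }
static void dma_write_memory_image_to_disk(void)         { puts("image written"); }
static void mark_hibernation_file_valid(void)            { puts("image valid"); }
static void power_down_disk_and_dma(void)                { puts("disk/DMA off"); }

/* Entry into "suspend backed by hibernation": the machine appears asleep
 * almost immediately, while the image is written in the background. */
int main(void)
{
    power_down_devices_except_disk_and_dma(); /* step 1202: looks suspended */
    dma_write_memory_image_to_disk();         /* step 1204: memory -> disk  */
    mark_hibernation_file_valid();
    power_down_disk_and_dma();                /* step 1206: true suspend,   */
    return 0;                                 /* backed by a valid image    */
}
```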
Step 1208 represents the user activating the machine. Because the system is suspended/asleep, if there was no power outage and the memory is still intact at step 1210, the system may instantly resume from its sleep/suspend state and mark (e.g., via the hardware/firmware that enables resuming from suspend) the hibernation image as "invalid," since it is not needed (step 1212). However, if a power outage occurred and the user turns the machine back on, the hardware/firmware will load the system loader process (step 1214), which in turn will see a valid hibernation file (step 1216) and resume from hibernation as described above. At step 1218, the hibernation file is marked as invalid by the hibernation engine when the resume is successful, as also described above.
In this manner, the user perceives a near-instantaneous shut down, and a near-instantaneous restart in most cases where the power was not interrupted. However, if power is lost, while the system will not resume operation as fast as from suspend, the user does not lose any of the machine state. As can be readily appreciated, the advantages of both suspend (instant resume) and hibernation (resistance to power outages) are normally achieved, without their disadvantages.
As can be seen from the foregoing detailed description, there is provided improved hibernation for computer systems. Entering and resuming from hibernation is made significantly faster by DMA and by a faster decoder. Hibernation may be combined with suspend/sleep modes to provide near-instantaneous state preservation, yet with the resistance to power outages provided by hibernation.
While the invention is susceptible to various modifications and alternative constructions, certain illustrated embodiments thereof are shown in the drawings and have been described above in detail. It should be understood, however, that there is no intention to limit the invention to the specific form or forms disclosed, but on the contrary, the intention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of the invention.

Claims (9)

1. A computer-implemented method, comprising:
disabling interrupts on a computer system for entering a state of hibernation;
preparing a first set of data for writing a first set of content of volatile memory to a disk;
instructing a controller to write the first set of data asynchronously to the disk;
polling, in intermittent polling operations, a status register to determine when the write to the disk is complete; and
while between polling operations, preparing a second set of data for writing a second set of content of volatile memory to the disk.
2. The method of claim 1 wherein preparing the first set of data for writing includes compressing the data.
3. The method of claim 1 wherein preparing the first set of data for writing includes segmenting the data.
4. The method of claim 1 further comprising, calling at least one driver to obtain information about write limitations of the disk.
5. The method of claim 1 further comprising, reserving at least one special section of memory for preparing the first set of data for writing.
6. The method of claim 1 further comprising, calling a disk driver to obtain information about memory requirements of the driver.
7. The method of claim 6 further comprising, reserving at least one special section of memory for the disk driver based on the memory requirements thereof.
8. A computer-implemented method, comprising:
disabling interrupts on a computer system;
instructing a controller to read data asynchronously from a disk to an input buffer;
polling, in intermittent polling operations, a status register to determine when the read from the disk is complete; and
while between polling operations, decompressing at least some data in the input buffer, wherein decompressing at least some of the data in the input buffer comprises, accessing the data including tokens comprising literal symbols and length, offset pairings, ensuring that space to hold a predetermined number of at least two output symbols is present in an output buffer, selecting a token from the input data, and when the token comprises a literal symbol, copying the literal symbol into the output buffer without checking whether the symbol will fit into the output buffer, or when the token comprises a length, offset pairing, copying at least the predetermined number of symbols into the output buffer without checking whether each symbol will fit into the output buffer.
9. The method of claim 8 further comprising, loading at least one driver to enable the controller to read the data asynchronously from the disk.
US11/010,778 2001-03-21 2004-12-13 Hibernation of computer systems Expired - Lifetime US6963934B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/010,778 US6963934B2 (en) 2001-03-21 2004-12-13 Hibernation of computer systems

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/814,474 US6883037B2 (en) 2001-03-21 2001-03-21 Fast data decoder that operates with reduced output buffer bounds checking
US11/010,778 US6963934B2 (en) 2001-03-21 2004-12-13 Hibernation of computer systems

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/814,474 Continuation US6883037B2 (en) 2001-03-21 2001-03-21 Fast data decoder that operates with reduced output buffer bounds checking

Publications (2)

Publication Number Publication Date
US20050108589A1 US20050108589A1 (en) 2005-05-19
US6963934B2 true US6963934B2 (en) 2005-11-08

Family

ID=25215145

Family Applications (4)

Application Number Title Priority Date Filing Date
US09/814,474 Expired - Lifetime US6883037B2 (en) 2001-03-21 2001-03-21 Fast data decoder that operates with reduced output buffer bounds checking
US11/010,778 Expired - Lifetime US6963934B2 (en) 2001-03-21 2004-12-13 Hibernation of computer systems
US11/010,504 Expired - Lifetime US7305502B2 (en) 2001-03-21 2004-12-13 Compressing an amount of uncompressed data less than amount of hibernation related data written to storage via DMA controller when requested write is pending
US11/010,498 Expired - Lifetime US7634592B2 (en) 2001-03-21 2004-12-13 Method, system, and program for hibernation of computer systems having DMA controller to asynchronously read data from buffers

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US09/814,474 Expired - Lifetime US6883037B2 (en) 2001-03-21 2001-03-21 Fast data decoder that operates with reduced output buffer bounds checking

Family Applications After (2)

Application Number Title Priority Date Filing Date
US11/010,504 Expired - Lifetime US7305502B2 (en) 2001-03-21 2004-12-13 Compressing an amount of uncompressed data less than amount of hibernation related data written to storage via DMA controller when requested write is pending
US11/010,498 Expired - Lifetime US7634592B2 (en) 2001-03-21 2004-12-13 Method, system, and program for hibernation of computer systems having DMA controller to asynchronously read data from buffers

Country Status (1)

Country Link
US (4) US6883037B2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110264769A1 (en) * 2010-04-27 2011-10-27 Yoneda Munehiro Content specifying apparatus and program of the same
US8412907B1 (en) * 2004-12-07 2013-04-02 Google Inc. System, method and computer program product for application-level cache-mapping awareness and reallocation
TWI559688B (en) * 2014-09-24 2016-11-21 英特爾公司 Technologies for efficient lz77-based data decompression

Families Citing this family (69)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6968469B1 (en) * 2000-06-16 2005-11-22 Transmeta Corporation System and method for preserving internal processor context when the processor is powered down and restoring the internal processor context when processor is restored
US7958289B2 (en) * 2002-08-08 2011-06-07 International Business Machines Corporation Method and system for storing memory compressed data onto memory compressed disks
US6879939B1 (en) * 2002-12-27 2005-04-12 Unisys Corporation Control of distribution of input/output operations during system testing
TW591372B (en) * 2003-05-15 2004-06-11 High Tech Comp Corp Power control method of portable electronic device, portable electronic device and electronic system
US7093089B2 (en) * 2003-08-18 2006-08-15 Hewlett-Packard Development Company, Lp. Systems and methods for storing data on computer systems
US7424632B2 (en) 2004-02-13 2008-09-09 Microsoft Corporation Systems and methods that facilitate state machine power and wake state management
US7428694B2 (en) 2004-03-02 2008-09-23 Stmicroelectronics S.A. Device for protection against error injection into a synchronous flip-flop of an elementary logic module
US7966424B2 (en) 2004-03-15 2011-06-21 Microsoft Corporation Data compression
US20060143485A1 (en) * 2004-12-28 2006-06-29 Alon Naveh Techniques to manage power for a mobile device
US7664970B2 (en) 2005-12-30 2010-02-16 Intel Corporation Method and apparatus for a zero voltage processor sleep state
TWI237806B (en) * 2004-11-03 2005-08-11 Sunplus Technology Co Ltd Audio decoding system with ring buffer and method thereof
JP4357442B2 (en) * 2005-03-23 2009-11-04 株式会社東芝 Plan execution device, plan execution method and program
KR101135282B1 (en) * 2005-04-12 2012-04-12 파나소닉 주식회사 Video/audio processing device and apparatus connected to the same
US7349829B1 (en) 2005-05-27 2008-03-25 Reference Diagnostics, Inc. Data logger
US8001430B2 (en) * 2005-06-30 2011-08-16 Freescale Semiconductor, Inc. Device and method for controlling an execution of a DMA task
WO2007003984A1 (en) * 2005-06-30 2007-01-11 Freescale Semiconductor, Inc. Device and method for arbitrating between direct memory access task requests
US20090125647A1 (en) * 2005-06-30 2009-05-14 Citibank, N.A. Device And Method For Executing A DMA Task
EP1899825B1 (en) * 2005-06-30 2009-07-22 Freescale Semiconductor, Inc. Device and method for controlling multiple dma tasks
US20070124573A1 (en) * 2005-10-28 2007-05-31 Walker Phillip M Method for rapid startup of a computer system
US20070150760A1 (en) * 2005-12-22 2007-06-28 Nowlin Dan H Reducing the amount of memory contents saved to non-volatile storage
US7640440B2 (en) * 2006-04-25 2009-12-29 Apple Inc. Method and apparatus for facilitating device hibernation
JP2007334383A (en) * 2006-06-12 2007-12-27 Sony Corp Information processor, method of starting it, and program
US8018884B2 (en) 2006-06-21 2011-09-13 Qualcomm Incorporated Low duty cycle network controller
US8700105B2 (en) * 2006-06-22 2014-04-15 Qualcomm Incorporated Low duty cycle device protocol
EP2162412A1 (en) * 2007-05-08 2010-03-17 Vesta Research, Ltd. Shaped, flexible fuel and energetic system therefrom
US7831858B2 (en) * 2007-08-31 2010-11-09 Intel Corporation Extended fault resilience for a platform
JP2009146061A (en) * 2007-12-12 2009-07-02 Canon Inc Information processing device and start-up method of the device
US20090210888A1 (en) * 2008-02-14 2009-08-20 Microsoft Corporation Software isolated device driver architecture
US9185654B2 (en) * 2008-07-16 2015-11-10 Qualcomm Incorporated Network server having an information and scheduling controller to support one or more low duty cycle wireless devices
US7895466B2 (en) * 2008-09-02 2011-02-22 Telefonaktiebolaget L M Ericsson (Publ) DMA assisted data backup and restore
US8499174B2 (en) * 2008-09-08 2013-07-30 Via Technologies, Inc. Method and controller for power management
US8504850B2 (en) * 2008-09-08 2013-08-06 Via Technologies, Inc. Method and controller for power management
US8181046B2 (en) * 2008-10-29 2012-05-15 Sandisk Il Ltd. Transparent self-hibernation of non-volatile memory system
EP2244186A3 (en) * 2009-03-11 2010-11-10 Harman Becker Automotive Systems GmbH Computing device and start-up method therefor
JP4986247B2 (en) * 2009-06-02 2012-07-25 株式会社ユビキタス Program, control method, and control apparatus
US8694814B1 (en) * 2010-01-10 2014-04-08 Apple Inc. Reuse of host hibernation storage space by memory controller
US9063728B2 (en) * 2010-03-17 2015-06-23 Apple Inc. Systems and methods for handling hibernation data
KR20110110434A (en) * 2010-04-01 2011-10-07 삼성전자주식회사 Low power audio play device and mehod
JP5598144B2 (en) * 2010-08-04 2014-10-01 ソニー株式会社 Information processing apparatus, power supply control method, and program
KR101691091B1 (en) 2010-11-08 2016-12-30 삼성전자주식회사 Computing system and hibernation method thereof
US8694764B2 (en) 2011-02-24 2014-04-08 Microsoft Corporation Multi-phase resume from hibernate
KR101799292B1 (en) 2011-04-19 2017-11-20 삼성전자주식회사 Electronic device and booting method thereof
JP5783809B2 (en) * 2011-06-03 2015-09-24 キヤノン株式会社 Information processing apparatus, activation method, and program
DE112011105700T5 (en) * 2011-10-01 2014-07-17 Intel Corporation Quick Sleep and Quick Recovery for a platform of computer system
CN102521062B (en) * 2011-11-29 2015-02-11 西安空间无线电技术研究所 Software fault-tolerant method capable of comprehensively on-line self-detection single event upset
US8984316B2 (en) 2011-12-29 2015-03-17 Intel Corporation Fast platform hibernation and resumption of computing systems providing secure storage of context data
US8751757B1 (en) * 2011-12-30 2014-06-10 Emc Corporation Acquisition and kernel memory storage of I/O metrics
JP6061495B2 (en) * 2012-05-21 2017-01-18 キヤノン株式会社 Image forming apparatus, image forming apparatus control method, and program
JP6108710B2 (en) * 2012-08-06 2017-04-05 キヤノン株式会社 Information processing apparatus, information processing apparatus control method, and program
GB2520864A (en) * 2012-09-28 2015-06-03 Hewlett Packard Development Co Transition to an intermediate power state
JP6080492B2 (en) * 2012-10-25 2017-02-15 キヤノン株式会社 Information processing apparatus, activation method, and program
US9904559B2 (en) * 2012-10-25 2018-02-27 Canon Kabushiki Kaisha Information processing apparatus and activation method therefor for processing data of a hibernation image
JP6080490B2 (en) * 2012-10-25 2017-02-15 キヤノン株式会社 Information processing apparatus, activation method, and program
JP6080491B2 (en) * 2012-10-25 2017-02-15 キヤノン株式会社 Information processing apparatus, activation method, and program
US20140129751A1 (en) * 2012-11-07 2014-05-08 Taejin Info Tech Co., Ltd. Hybrid interface to improve semiconductor memory based ssd performance
US10565099B2 (en) * 2012-12-28 2020-02-18 Apple Inc. Methods and apparatus for compressed and compacted virtual memory
US9032139B2 (en) 2012-12-28 2015-05-12 Intel Corporation Memory allocation for fast platform hibernation and resumption of computing systems
IN2013DE02709A (en) * 2013-09-13 2015-06-26 Ge Healthcare Bio Sciences Ab
JP2015103077A (en) * 2013-11-26 2015-06-04 富士通株式会社 Arithmetic processing unit, information processor, and control method of information processor
TWI610239B (en) 2013-12-27 2018-01-01 財團法人工業技術研究院 Electronic apparatus and method for resuming from hibernation
JP6399916B2 (en) * 2014-01-20 2018-10-03 キヤノン株式会社 Information processing apparatus and control method thereof
US9287893B1 (en) 2015-05-01 2016-03-15 Google Inc. ASIC block for high bandwidth LZ77 decompression
US9940039B2 (en) 2015-05-21 2018-04-10 Western Digital Technologies, Inc. Method and data storage device with enhanced data retention
US10489163B2 (en) * 2016-06-06 2019-11-26 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Loading a program during boot of a device
US10452561B2 (en) 2016-08-08 2019-10-22 Raytheon Company Central processing unit architecture and methods for high availability systems
US10838922B2 (en) * 2017-03-31 2020-11-17 International Business Machines Corporation Data compression by using cognitive created dictionaries
US10649515B2 (en) * 2018-02-12 2020-05-12 Saudi Arabian Oil Company Power management of computing and communications systems during power fluctuation and sudden power failure events
US11314555B2 (en) 2019-08-26 2022-04-26 International Business Machines Corporation Synchronous re-execution of a data transformation operation to obtain further details regarding an exception
US11947420B2 (en) * 2021-12-15 2024-04-02 Google Llc Hardware memory error tolerant software system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4398299A (en) 1980-10-10 1983-08-09 Bell Telephone Laboratories, Incorporated Data set network diagnostic system
US5509124A (en) 1986-09-19 1996-04-16 International Business Machines Corporation Coupled synchronous-asynchronous bus structure for transferring data between a plurality of peripheral input/output controllers and a main data store
US5841696A (en) 1997-03-05 1998-11-24 Advanced Micro Devices, Inc. Non-volatile memory enabling simultaneous reading and writing by time multiplexing a decode path
US5875310A (en) 1996-05-24 1999-02-23 International Business Machines Corporation Secondary I/O bus with expanded slot capacity and hot plugging capability
US6092140A (en) 1998-05-15 2000-07-18 Hewlett-Packard Company Low latency bridging between high speed bus networks
US6434648B1 (en) 1998-12-10 2002-08-13 Smart Modular Technologies, Inc. PCMCIA compatible memory card with serial communication interface

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4395299A (en) * 1981-08-21 1983-07-26 The United States Of America As Represented By The Secretary Of The Army Bonded bulk graphite and process for bonding
US5021983B1 (en) * 1989-11-13 1996-05-28 Chips & Technologies Inc Suspend/resume apparatus and method for reducing power consumption in battery powered computers
US5497494A (en) * 1993-07-23 1996-03-05 International Business Machines Corporation Method for saving and restoring the state of a CPU executing code in protected mode
US5511202A (en) * 1993-07-26 1996-04-23 International Business Machines Corporation Desktop computer system having zero-volt system suspend and control unit for ascertaining interrupt controller base address
JP3106401B2 (en) 1993-07-26 2000-11-06 インターナショナル・ビジネス・マシーンズ・コーポレ−ション Information processing system
KR970010634B1 * 1994-10-25 1997-06-28 삼성전자 주식회사 Network hibernation system
US6678712B1 (en) 1996-01-19 2004-01-13 International Business Machines Corporation Method and system for executing a program under one of a plurality of mutually exclusive operating environments
JP3204251B2 (en) 1999-06-30 2001-09-04 インターナショナル・ビジネス・マシーンズ・コーポレーション Hibernation apparatus and method, recording medium storing the same, and computer to which the same is applied
US6732280B1 (en) * 1999-07-26 2004-05-04 Hewlett-Packard Development Company, L.P. Computer system performing machine specific tasks before going to a low power state
US6546472B2 (en) * 2000-12-29 2003-04-08 Hewlett-Packard Development Company, L.P. Fast suspend to disk
US6792550B2 (en) * 2001-01-31 2004-09-14 Hewlett-Packard Development Company, L.P. Method and apparatus for providing continued operation of a multiprocessor computer system after detecting impairment of a processor cooling device

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Adler et al., "About Info-ZIP," Info-ZIP Home Page, ftp://ftp.freesoftware.com/pub/infozip/index.html, printed Mar. 9, 2001.
Buren et al., "A Micro Processing System Managing the IEC-Bus," Conference on Microprocessors in Automation and Communications, pp. 395-402 (Sep. 1978).
Fisher, Eugene, "Speed Microprocessor Responses," Electronic Design, vol. 23, No. 23, pp. 78-82 (Nov. 1975).
Oberhumer, Markus F.X.J., "LZO Data Compression Library," http://wildsau.idv.uni-linz.ac.at/mfx/lzo.html, printed Mar. 9, 2001.
Shankar, S., "Data Transfer By Cycle Stealing in the 8080," Electronic Engineering (London), vol. 51, No. 624, pp. 20-21 (Jun. 1979).
Ziv et al., "A Universal Algorithm for Sequential Data Compression," IEEE Transactions on Information Theory, vol. 23, No. 3, pp. 337-343 (May 1977).

Also Published As

Publication number Publication date
US6883037B2 (en) 2005-04-19
US20050149646A1 (en) 2005-07-07
US20020138669A1 (en) 2002-09-26
US7634592B2 (en) 2009-12-15
US20050097239A1 (en) 2005-05-05
US20050108589A1 (en) 2005-05-19
US7305502B2 (en) 2007-12-04

Similar Documents

Publication Publication Date Title
US6963934B2 (en) Hibernation of computer systems
US6438668B1 (en) Method and apparatus for reducing power consumption in a digital processing system
US7454653B2 (en) Reliability of diskless network-bootable computers using non-volatile memory cache
US8533449B2 (en) Multi-phase resume from hibernate
US7373530B2 (en) Systems and methods for providing power-loss protection to sleeping computers systems
EP0710375B1 (en) File backup system
US5651139A (en) Protected system partition read/write access on a SCSI controlled DASD
US8146093B2 (en) Computer multiple operation system switching method
US20030217299A1 (en) Power management system and method
JP2001516924A (en) Data transfer to non-volatile storage media
US5828877A (en) Circuit and method for optimizing creation of a compressed main memory image
JPH10228379A (en) Method and computer system for executing architecture specification code having reduced run-time memory space request
WO2006036375A1 (en) Operating system transfer and launch without performing post
JPH04362717A (en) System restarting device
US6901481B2 (en) Method and apparatus for storing transactional information in persistent memory
US6014681A (en) Method for saving a document using a background save thread
US20040230784A1 (en) Concurrent program loading and execution
US20060206652A1 (en) Machine state storage apparatus and method
EP1293904A1 (en) Method of initiating a backup procedure
WO2010116405A1 (en) Calculation system provided with nonvolatile main memory
JP6080492B2 (en) Information processing apparatus, activation method, and program
JP2003044284A (en) Activation method for computer system and program for activation
JPH05233267A (en) Device incorporating program rom
US5778207A (en) Assisting operating-system interrupts using application-based processing
US20030105783A1 (en) Method and system of file protection

Legal Events

Date Code Title Description
STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

CC Certificate of correction
FPAY Fee payment

Year of fee payment: 8

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034543/0001

Effective date: 20141014

FPAY Fee payment

Year of fee payment: 12