Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt
new file mode 100644
index 0000000..5d4ae9a
--- /dev/null
+++ b/Documentation/power/devices.txt
@@ -0,0 +1,319 @@
+
+Device Power Management
+
+
+Device power management encompasses two areas - the ability to save
+state and transition a device to a low-power state when the system is
+entering a low-power state; and the ability to transition a device to
+a low-power state while the system is running (and independently of
+any other power management activity).
+
+
+Methods
+
+The methods to suspend and resume devices reside in struct bus_type:
+
+struct bus_type {
+ ...
+ int (*suspend)(struct device * dev, pm_message_t state);
+ int (*resume)(struct device * dev);
+};
+
+Each bus driver is responsible for implementing these methods, translating
+the call into a bus-specific request and forwarding the call to the
+bus-specific drivers. For example, PCI drivers implement suspend() and
+resume() methods in struct pci_driver. The PCI core is simply
+responsible for translating the pointers to PCI-specific ones and
+calling the low-level driver.
+
+This is done to a) ease transition to the new power management methods
+and leverage the existing PM code in various bus drivers; b) allow
+buses to implement generic and default PM routines for devices, and c)
+make the flow of execution obvious to the reader.
+
+
+System Power Management
+
+When the system enters a low-power state, the device tree is walked in
+a depth-first fashion to transition each device into a low-power
+state. The ordering of the device tree is guaranteed by the order in
+which devices get registered - children are never registered before
+their ancestors, and devices are placed at the back of the list when
+registered. By walking the list in reverse order, we are guaranteed to
+suspend devices in the proper order.
+
+Devices are suspended once with interrupts enabled. Drivers are
+expected to stop I/O transactions, save device state, and place the
+device into a low-power state. Drivers may sleep, allocate memory,
+etc. at will.
+
+Some devices are broken and will inevitably have problems powering
+down or disabling themselves with interrupts enabled. For these
+special cases, they may return -EAGAIN. This will put the device on a
+list to be taken care of later. When interrupts are disabled, before
+we enter the low-power state, their drivers are called again to put
+their device to sleep.
+
+On resume, the devices that returned -EAGAIN will be called to power
+themselves back on with interrupts disabled. Once interrupts have been
+re-enabled, the rest of the drivers will be called to resume their
+devices. On resume, a driver is responsible for powering back on each
+device, restoring state, and re-enabling I/O transactions for that
+device.
+
+System devices follow a slightly different API, which can be found in
+
+ include/linux/sysdev.h
+ drivers/base/sys.c
+
+System devices will only be suspended with interrupts disabled, and
+after all other devices have been suspended. On resume, they will be
+resumed before any other devices, and also with interrupts disabled.
+
+
+Runtime Power Management
+
+Many devices are able to dynamically power down while the system is
+still running. This feature is useful for devices that are not being
+used, and can offer significant power savings on a running system.
+
+In each device's directory, there is a 'power' directory, which
+contains at least a 'state' file. Reading from this file displays what
+power state the device is currently in. Writing to this file initiates
+a transition to the specified power state, which must be a decimal in
+the range 1-3, inclusive; or 0 for 'On'.
+
+The PM core will call the ->suspend() method in the bus_type object
+that the device belongs to if the specified state is not 0, or
+->resume() if it is.
+
+Nothing will happen if the specified state is the same state the
+device is currently in.
+
+If the device is already in a low-power state, and the specified state
+is another, but different, low-power state, the ->resume() method will
+first be called to power the device back on, then ->suspend() will be
+called again with the new state.
+
+The driver is responsible for saving the working state of the device
+and putting it into the low-power state specified. If this was
+successful, it returns 0, and the device's power_state field is
+updated.
+
+The driver must take care to know whether or not it is able to
+properly resume the device, including all steps of reinitialization
+necessary. (This is the hardest part, and the one most protected by
+NDA'd documents).
+
+The driver must also take care not to suspend a device that is
+currently in use. It is the driver's responsibility to provide its own
+exclusion mechanisms.
+
+The runtime power transition happens with interrupts enabled. If a
+device cannot support being powered down with interrupts enabled, it may
+return -EAGAIN (as it would during a system power management
+transition), but it will _not_ be called again, and the transition
+will fail.
+
+There is currently no way to know what states a device or driver
+supports a priori. This will change in the future.
+
+pm_message_t meaning
+
+pm_message_t has two fields: event ("major") and flags. If a driver
+does not know the event code, it aborts the request and returns an error.
+Some drivers may need to deal with special cases based on the actual type
+of suspend operation being done at the system level. This is why
+there are flags.
+
+Event codes are:
+
+ON -- no need to do anything except special cases like broken
+HW.
+
+# NOTIFICATION -- pretty much same as ON?
+
+FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from
+scratch. That probably means stop accepting upstream requests, the
+actual policy of what to do with them being specific to a given
+driver. It's acceptable for a network driver to just drop packets
+while a block driver is expected to block the queue so no request is
+lost. (Use IDE as an example on how to do that). FREEZE requires no
+power state change, and it's expected for drivers to be able to
+quickly transition back to operating state.
+
+SUSPEND -- like FREEZE, but also put hardware into low-power state. If
+there's a need to distinguish several levels of sleep, an additional flag
+is probably the best way to do that.
+
+Transitions are only from a resumed state to a suspended state, never
+between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen,
+FREEZE -> SUSPEND or SUSPEND -> FREEZE can not).
+
+All events are:
+
+[NOTE NOTE NOTE: If you are a driver author, you should not care; you
+should only look at event, and ignore flags.]
+
+#Prepare for suspend -- userland is still running but we are going to
+#enter suspend state. This gives drivers a chance to load firmware from
+#disk and store it in memory, or do other activities that require
+#operating userland, ability to kmalloc GFP_KERNEL, etc... All of these
+#are forbidden once the suspend dance is started. event = ON, flags =
+#PREPARE_TO_SUSPEND
+
+Apm standby -- prepare for APM event. Quiesce devices to make life
+easier for APM BIOS. event = FREEZE, flags = APM_STANDBY
+
+Apm suspend -- same as APM_STANDBY, but we should probably avoid
+spinning down disks. event = FREEZE, flags = APM_SUSPEND
+
+System halt, reboot -- quiesce devices to make life easier for BIOS. event
+= FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT
+
+System shutdown -- at least disks need to be spun down, or data may be
+lost. Quiesce devices, just to make life easier for BIOS. event =
+FREEZE, flags = SYSTEM_SHUTDOWN
+
+Kexec -- turn off DMAs and put hardware into some state where new
+kernel can take over. event = FREEZE, flags = KEXEC
+
+Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake
+may need to be enabled on some devices. This actually has at least 3
+subtypes: the system can reboot, enter S4, or enter S5 at the end of
+swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT,
+SYSTEM_SHUTDOWN, SYSTEM_S4
+
+Suspend to ram -- put devices into low power state. event = SUSPEND,
+flags = SUSPEND_TO_RAM
+
+Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put
+devices into low power mode, but you must be able to reinitialize
+device from scratch in resume method. This has two flavors: it's done
+once on the suspending kernel, once on the resuming kernel. event = FREEZE,
+flags = DURING_SUSPEND or DURING_RESUME
+
+Device detach requested from /sys -- deinitialize device; probably same as
+SYSTEM_SHUTDOWN, I do not understand this one too much. probably event
+= FREEZE, flags = DEV_DETACH.
+
+#These are not really events sent:
+#
+#System fully on -- device is working normally; this is probably never
+#passed to suspend() method... event = ON, flags = 0
+#
+#Ready after resume -- userland is now running, again. Time to free any
+#memory you ate during prepare to suspend... event = ON, flags =
+#READY_AFTER_RESUME
+#
+
+Driver Detach Power Management
+
+The kernel now supports the ability to place a device in a low-power
+state when it is detached from its driver, which happens when its
+module is removed.
+
+Each device contains a 'detach_state' file in its sysfs directory
+which can be used to control this state. Reading from this file
+displays what the current detach state is set to. This is 0 (On) by
+default. A user may write a positive integer value to this file in the
+range of 1-4 inclusive.
+
+A value of 1-3 will indicate the device should be placed in that
+low-power state, which will cause ->suspend() to be called for that
+device. A value of 4 indicates that the device should be shut down, so
+->shutdown() will be called for that device.
+
+The driver is responsible for reinitializing the device when the
+module is re-inserted during its ->probe() (or equivalent) method.
+The driver core will not call any extra functions when binding the
+device to the driver.
+
+pm_message_t meaning
+
+pm_message_t has two fields: event ("major") and flags. If a driver
+does not know the event code, it aborts the request and returns an error.
+Some drivers may need to deal with special cases based on the actual type
+of suspend operation being done at the system level. This is why
+there are flags.
+
+Event codes are:
+
+ON -- no need to do anything except special cases like broken
+HW.
+
+# NOTIFICATION -- pretty much same as ON?
+
+FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from
+scratch. That probably means stop accepting upstream requests, the
+actual policy of what to do with them being specific to a given
+driver. It's acceptable for a network driver to just drop packets
+while a block driver is expected to block the queue so no request is
+lost. (Use IDE as an example on how to do that). FREEZE requires no
+power state change, and it's expected for drivers to be able to
+quickly transition back to operating state.
+
+SUSPEND -- like FREEZE, but also put hardware into low-power state. If
+there's a need to distinguish several levels of sleep, an additional flag
+is probably the best way to do that.
+
+Transitions are only from a resumed state to a suspended state, never
+between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen,
+FREEZE -> SUSPEND or SUSPEND -> FREEZE can not).
+
+All events are:
+
+[NOTE NOTE NOTE: If you are a driver author, you should not care; you
+should only look at event, and ignore flags.]
+
+#Prepare for suspend -- userland is still running but we are going to
+#enter suspend state. This gives drivers a chance to load firmware from
+#disk and store it in memory, or do other activities that require
+#operating userland, ability to kmalloc GFP_KERNEL, etc... All of these
+#are forbidden once the suspend dance is started. event = ON, flags =
+#PREPARE_TO_SUSPEND
+
+Apm standby -- prepare for APM event. Quiesce devices to make life
+easier for APM BIOS. event = FREEZE, flags = APM_STANDBY
+
+Apm suspend -- same as APM_STANDBY, but we should probably avoid
+spinning down disks. event = FREEZE, flags = APM_SUSPEND
+
+System halt, reboot -- quiesce devices to make life easier for BIOS. event
+= FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT
+
+System shutdown -- at least disks need to be spun down, or data may be
+lost. Quiesce devices, just to make life easier for BIOS. event =
+FREEZE, flags = SYSTEM_SHUTDOWN
+
+Kexec -- turn off DMAs and put hardware into some state where new
+kernel can take over. event = FREEZE, flags = KEXEC
+
+Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake
+may need to be enabled on some devices. This actually has at least 3
+subtypes: the system can reboot, enter S4, or enter S5 at the end of
+swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT,
+SYSTEM_SHUTDOWN, SYSTEM_S4
+
+Suspend to ram -- put devices into low power state. event = SUSPEND,
+flags = SUSPEND_TO_RAM
+
+Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put
+devices into low power mode, but you must be able to reinitialize
+device from scratch in resume method. This has two flavors: it's done
+once on the suspending kernel, once on the resuming kernel. event = FREEZE,
+flags = DURING_SUSPEND or DURING_RESUME
+
+Device detach requested from /sys -- deinitialize device; probably same as
+SYSTEM_SHUTDOWN, I do not understand this one too much. probably event
+= FREEZE, flags = DEV_DETACH.
+
+#These are not really events sent:
+#
+#System fully on -- device is working normally; this is probably never
+#passed to suspend() method... event = ON, flags = 0
+#
+#Ready after resume -- userland is now running, again. Time to free any
+#memory you ate during prepare to suspend... event = ON, flags =
+#READY_AFTER_RESUME
+#
diff --git a/Documentation/power/interface.txt b/Documentation/power/interface.txt
new file mode 100644
index 0000000..f5ebda5
--- /dev/null
+++ b/Documentation/power/interface.txt
@@ -0,0 +1,43 @@
+Power Management Interface
+
+
+The power management subsystem provides a unified sysfs interface to
+userspace, regardless of what architecture or platform one is
+running. The interface exists in /sys/power/ directory (assuming sysfs
+is mounted at /sys).
+
+/sys/power/state controls system power state. Reading from this file
+returns what states are supported, which is hard-coded to 'standby'
+(Power-On Suspend), 'mem' (Suspend-to-RAM), and 'disk'
+(Suspend-to-Disk).
+
+Writing one of those strings to this file causes the system to
+transition into that state. Please see the file
+Documentation/power/states.txt for a description of each of those
+states.
+
+
+/sys/power/disk controls the operating mode of the suspend-to-disk
+mechanism. Suspend-to-disk can be handled in several ways. The
+greatest distinction is who writes memory to disk - the firmware or
+the kernel. If the firmware does it, we assume that it also handles
+suspending the system.
+
+If the kernel does it, then we have three options for putting the system
+to sleep - using the platform driver (e.g. ACPI or other PM
+registers), powering off the system or rebooting the system (for
+testing). The system will support either 'firmware' or 'platform', and
+that is known a priori. But, the user may choose 'shutdown' or
+'reboot' as alternatives.
+
+Reading from this file will display what the mode is currently set
+to. Writing to this file will accept one of
+
+ 'firmware'
+ 'platform'
+ 'shutdown'
+ 'reboot'
+
+It will only change to 'firmware' or 'platform' if the system supports
+it.
+
diff --git a/Documentation/power/kernel_threads.txt b/Documentation/power/kernel_threads.txt
new file mode 100644
index 0000000..60b5481
--- /dev/null
+++ b/Documentation/power/kernel_threads.txt
@@ -0,0 +1,41 @@
+KERNEL THREADS
+
+
+Freezer
+
+Upon entering a suspended state the system will freeze all
+tasks. This is done by delivering pseudosignals. This affects
+kernel threads, too. To successfully freeze a kernel thread
+the thread has to check for the pseudosignal and enter the
+refrigerator. Code to do this looks like this:
+
+	do {
+		hub_events();
+		wait_event_interruptible(khubd_wait, !list_empty(&hub_event_list));
+		if (current->flags & PF_FREEZE)
+			refrigerator(PF_FREEZE);
+	} while (!signal_pending(current));
+
+from drivers/usb/core/hub.c::hub_thread()
+
+
+The Unfreezable
+
+Some kernel threads, however, must not be frozen. The kernel must
+be able to finish pending IO operations and later on be able to
+write the memory image to disk. Kernel threads needed to do IO
+must stay awake. Such threads must mark themselves unfreezable
+like this:
+
+ /*
+ * This thread doesn't need any user-level access,
+ * so get rid of all our resources.
+ */
+ daemonize("usb-storage");
+
+ current->flags |= PF_NOFREEZE;
+
+from drivers/usb/storage/usb.c::usb_stor_control_thread()
+
+Such drivers are themselves responsible for staying quiet during
+the actual snapshotting.
diff --git a/Documentation/power/pci.txt b/Documentation/power/pci.txt
new file mode 100644
index 0000000..c85428e
--- /dev/null
+++ b/Documentation/power/pci.txt
@@ -0,0 +1,332 @@
+
+PCI Power Management
+~~~~~~~~~~~~~~~~~~~~
+
+An overview of the concepts and the related functions in the Linux kernel
+
+Patrick Mochel <mochel@transmeta.com>
+(and others)
+
+---------------------------------------------------------------------------
+
+1. Overview
+2. How The PCI Subsystem Handles Power Management
+3. PCI Utility Functions
+4. PCI Device Drivers
+5. Resources
+
+1. Overview
+~~~~~~~~~~~
+
+The PCI Power Management Specification was introduced between the PCI 2.1 and
+PCI 2.2 Specifications. It is a standard interface for controlling various
+power management operations.
+
+Implementation of the PCI PM Spec is optional, as are several sub-components of
+it. If a device supports the PCI PM Spec, the device will have an 8 byte
+capability field in its PCI configuration space. This field is used to describe
+and control the standard PCI power management features.
+
+The PCI PM spec defines 4 operating states for devices (D0 - D3) and for buses
+(B0 - B3). The higher the number, the less power the device consumes. However,
+the higher the number, the longer the latency is for the device to return to
+an operational state (D0).
+
+There are actually two D3 states. When someone talks about D3, they usually
+mean D3hot, which corresponds to an ACPI D2 state (power is reduced, the
+device may lose some context). But they may also mean D3cold, which is an
+ACPI D3 state (power is fully off, all state was discarded); or both.
+
+Bus power management is not covered in this version of this document.
+
+Note that all PCI devices support D0 and D3cold by default, regardless of
+whether or not they implement any of the PCI PM spec.
+
+The possible state transitions that a device can undergo are:
+
++---------------------------+
+| Current State | New State |
++---------------------------+
+| D0 | D1, D2, D3|
++---------------------------+
+| D1 | D2, D3 |
++---------------------------+
+| D2 | D3 |
++---------------------------+
+| D1, D2, D3 | D0 |
++---------------------------+
+
+Note that when the system is entering a global suspend state, all devices will
+be placed into D3 and when resuming, all devices will be placed into D0.
+However, when the system is running, other state transitions are possible.
+
+2. How The PCI Subsystem Handles Power Management
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI suspend/resume functionality is accessed indirectly via the Power
+Management subsystem. At boot, the PCI driver registers a power management
+callback with that layer. Upon entering a suspend state, the PM layer iterates
+through all of its registered callbacks. This currently takes place only during
+APM state transitions.
+
+Upon going to sleep, the PCI subsystem walks its device tree twice. Both times,
+it does a depth-first walk of the device tree. The first walk saves each
+device's state and checks for devices that will prevent the system from entering
+a global power state. The next walk then places the devices in a low power
+state.
+
+The first walk allows a graceful recovery in the event of a failure, since none
+of the devices have actually been powered down.
+
+In both walks, in particular the second, all children of a bridge are touched
+before the actual bridge itself. This allows the bridge to retain power while
+its children are being accessed.
+
+Upon resuming from sleep, just the opposite must be true: all bridges must be
+powered on and restored before their children are powered on. This is easily
+accomplished with a breadth-first walk of the PCI device tree.
+
+
+3. PCI Utility Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+These are helper functions designed to be called by individual device drivers.
+Assuming that a device behaves as advertised, these should be applicable in most
+cases. However, results may vary.
+
+Note that these functions are never implicitly called for the driver. The driver
+is always responsible for deciding when and if to call these.
+
+
+pci_save_state
+--------------
+
+Usage:
+ pci_save_state(dev, buffer);
+
+Description:
+ Save first 64 bytes of PCI config space. Buffer must be allocated by
+ caller.
+
+
+pci_restore_state
+-----------------
+
+Usage:
+ pci_restore_state(dev, buffer);
+
+Description:
+ Restore previously saved config space (first 64 bytes only).
+
+ If buffer is NULL, then restore what information we know about the
+ device from bootup: BARs and interrupt line.
+
+
+pci_set_power_state
+-------------------
+
+Usage:
+ pci_set_power_state(dev, state);
+
+Description:
+ Transition device to low power state using PCI PM Capabilities
+ registers.
+
+ Will fail under one of the following conditions:
+ - If state is less than current state, but not D0 (illegal transition)
+ - Device doesn't support PM Capabilities
+ - Device does not support requested state
+
+
+pci_enable_wake
+---------------
+
+Usage:
+ pci_enable_wake(dev, state, enable);
+
+Description:
+ Enable device to generate PME# during low power state using PCI PM
+ Capabilities.
+
+ Checks whether the device supports generating PME# from the requested state
+ and fails if it does not, unless enable == 0 (request is to disable wake
+ events, which is implicit if it doesn't even support it in the first
+ place).
+
+ Note that the PMC Register in the device's PM Capabilities has a bitmask
+ of the states it supports generating PME# from. D3hot is bit 3 and
+ D3cold is bit 4. So, while a value of 4 as the state may not seem
+ semantically correct, it is.
+
+
+4. PCI Device Drivers
+~~~~~~~~~~~~~~~~~~~~~
+
+These functions are intended for use by individual drivers, and are defined in
+struct pci_driver:
+
+ int (*save_state) (struct pci_dev *dev, u32 state);
+ int (*suspend) (struct pci_dev *dev, u32 state);
+ int (*resume) (struct pci_dev *dev);
+ int (*enable_wake) (struct pci_dev *dev, u32 state, int enable);
+
+
+save_state
+----------
+
+Usage:
+
+if (dev->driver && dev->driver->save_state)
+ dev->driver->save_state(dev,state);
+
+The driver should use this callback to save device state. It should take into
+account the current state of the device and the requested state in order to
+avoid any unnecessary operations.
+
+For example, on a video card that supports all 4 states (D0-D3), all controller
+context is preserved when entering D1, but the screen is placed into a low power
+state (blanked).
+
+The driver can also interpret this function as a notification that it may be
+entering a sleep state in the near future. If it knows that the device cannot
+enter the requested state, either because of lack of support for it, or because
+the device is in the middle of some critical operation, then it should fail.
+
+This function should not be used to set any state in the device or the driver
+because the device may not actually enter the sleep state (e.g. another driver
+later causes a global state transition to fail).
+
+Note that in intermediate low power states, a device's I/O and memory spaces may
+be disabled and may not be available in subsequent transitions to lower power
+states.
+
+
+suspend
+-------
+
+Usage:
+
+if (dev->driver && dev->driver->suspend)
+ dev->driver->suspend(dev,state);
+
+A driver uses this function to actually transition the device into a low power
+state. This should include disabling I/O, IRQs, and bus-mastering, as well as
+physically transitioning the device to a lower power state; it may also include
+calls to pci_enable_wake().
+
+Bus mastering may be disabled by doing:
+
+pci_disable_device(dev);
+
+For devices that support the PCI PM Spec, this may be used to set the device's
+power state to match the suspend() parameter:
+
+pci_set_power_state(dev,state);
+
+The driver is also responsible for disabling any other device-specific features
+(e.g. blanking screen, turning off on-card memory, etc).
+
+The driver should be sure to track the current state of the device, as it may
+obviate the need for some operations.
+
+The driver should update the current_state field in its pci_dev structure in
+this function, except for PM-capable devices when pci_set_power_state is used.
+
+resume
+------
+
+Usage:
+
+if (dev->driver && dev->driver->resume)
+ dev->driver->resume(dev);
+
+The resume callback may be called from any power state, and is always meant to
+transition the device to the D0 state.
+
+The driver is responsible for reenabling any features of the device that had
+been disabled during previous suspend calls, such as IRQs and bus mastering,
+as well as calling pci_restore_state().
+
+If the device is currently in D3, it may need to be reinitialized in resume().
+
+ * Some types of devices, like bus controllers, will preserve context in D3hot
+ (using Vcc power). Their drivers will often want to avoid re-initializing
+ them after re-entering D0 (perhaps to avoid resetting downstream devices).
+
+ * Other kinds of devices in D3hot will discard device context as part of a
+ soft reset when re-entering the D0 state.
+
+ * Devices resuming from D3cold always go through a power-on reset. Some
+ device context can also be preserved using Vaux power.
+
+ * Some systems hide D3cold resume paths from drivers. For example, on PCs
+ the resume path for suspend-to-disk often runs BIOS powerup code, which
+ will sometimes re-initialize the device.
+
+To handle resets during D3 to D0 transitions, it may be convenient to share
+device initialization code between probe() and resume(). Device parameters
+can also be saved before the driver suspends into D3, avoiding re-probe.
+
+If the device supports the PCI PM Spec, it can use this to physically transition
+the device to D0:
+
+pci_set_power_state(dev,0);
+
+Note that if the entire system is transitioning out of a global sleep state, all
+devices will be placed in the D0 state, so this is not necessary. However, in
+the event that the device is placed in the D3 state during normal operation,
+this call is necessary. It is impossible to determine which of the two events is
+taking place in the driver, so it is always a good idea to make that call.
+
+The driver should take note of the state that it is resuming from in order to
+ensure correct (and speedy) operation.
+
+The driver should update the current_state field in its pci_dev structure in
+this function, except for PM-capable devices when pci_set_power_state is used.
+
+
+enable_wake
+-----------
+
+Usage:
+
+if (dev->driver && dev->driver->enable_wake)
+ dev->driver->enable_wake(dev,state,enable);
+
+This callback is generally only relevant for devices that support the PCI PM
+spec and have the ability to generate a PME# (Power Management Event Signal)
+to wake the system up. (However, it is possible that a device may support
+some non-standard way of generating a wake event on sleep.)
+
+Bits 15:11 of the PMC (Power Mgmt Capabilities) Register in a device's
+PM Capabilities describe what power states the device supports generating a
+wake event from:
+
++------------------+
+| Bit | State |
++------------------+
+| 11 | D0 |
+| 12 | D1 |
+| 13 | D2 |
+| 14 | D3hot |
+| 15 | D3cold |
++------------------+
+
+A device can use this to enable wake events:
+
+ pci_enable_wake(dev,state,enable);
+
+Note that to enable PME# from D3cold, a value of 4 should be passed to
+pci_enable_wake (since it uses an index into a bitmask). If a driver gets
+a request to enable wake events from D3, two calls should be made to
+pci_enable_wake (one each for D3hot and D3cold).
+
+
+5. Resources
+~~~~~~~~~~~~
+
+PCI Local Bus Specification
+PCI Bus Power Management Interface Specification
+
+ http://pcisig.org
+
diff --git a/Documentation/power/states.txt b/Documentation/power/states.txt
new file mode 100644
index 0000000..3e5e5d3
--- /dev/null
+++ b/Documentation/power/states.txt
@@ -0,0 +1,79 @@
+
+System Power Management States
+
+
+The kernel supports three power management states generically, though
+each is dependent on platform support code to implement the low-level
+details for each state. This file describes each state, what they are
+commonly called, what ACPI state they map to, and what string to write
+to /sys/power/state to enter that state.
+
+
+State: Standby / Power-On Suspend
+ACPI State: S1
+String: "standby"
+
+This state offers minimal, though real, power savings, while providing
+a very low-latency transition back to a working system. No operating
+state is lost (the CPU retains power), so the system easily starts up
+again where it left off.
+
+We try to put devices in a low-power state equivalent to D1, which
+also offers low power savings, but low resume latency. Not all devices
+support D1, and those that don't are left on.
+
+A transition from Standby to the On state should take about 1-2
+seconds.
+
+
+State: Suspend-to-RAM
+ACPI State: S3
+String: "mem"
+
+This state offers significant power savings as everything in the
+system is put into a low-power state, except for memory, which is
+placed in self-refresh mode to retain its contents.
+
+System and device state is saved and kept in memory. All devices are
+suspended and put into D3. In many cases, all peripheral buses lose
+power when entering STR, so devices must be able to handle the
+transition back to the On state.
+
+For at least ACPI, STR requires some minimal boot-strapping code to
+resume the system from STR. This may be true on other platforms.
+
+A transition from Suspend-to-RAM to the On state should take about
+3-5 seconds.
+
+
+State: Suspend-to-disk
+ACPI State: S4
+String: "disk"
+
+This state offers the greatest power savings, and can be used even in
+the absence of low-level platform support for power management. This
+state operates similarly to Suspend-to-RAM, but includes a final step
+of writing memory contents to disk. On resume, this is read and memory
+is restored to its pre-suspend state.
+
+STD can be handled by the firmware or the kernel. If it is handled by
+the firmware, it usually requires a dedicated partition that must be
+set up via another operating system for it to use. Despite the
+inconvenience, this method requires minimal work by the kernel, since
+the firmware will also handle restoring memory contents on resume.
+
+If the kernel is responsible for persistently saving state, a mechanism
+called 'swsusp' (Swap Suspend) is used to write memory contents to
+free swap space. swsusp has some restrictive requirements, but should
+work in most cases. Some, albeit outdated, documentation can be found
+in Documentation/power/swsusp.txt.
+
+Once memory state is written to disk, the system may either enter a
+low-power state (like ACPI S4), or it may simply power down. Powering
+down offers greater savings, and allows this mechanism to work on any
+system. However, entering a real low-power state allows the user to
+trigger wake up events (e.g. pressing a key or opening a laptop lid).
+
+A transition from Suspend-to-Disk to the On state should take about 30
+seconds, though it's typically a bit more with the current
+implementation.
diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt
new file mode 100644
index 0000000..c7c3459
--- /dev/null
+++ b/Documentation/power/swsusp.txt
@@ -0,0 +1,235 @@
+From kernel/suspend.c:
+
+ * BIG FAT WARNING *********************************************************
+ *
+ * If you have unsupported (*) devices using DMA...
+ * ...say goodbye to your data.
+ *
+ * If you touch anything on disk between suspend and resume...
+ * ...kiss your data goodbye.
+ *
+ * If your disk driver does not support suspend... (IDE does)
+ * ...you'd better find out how to get along
+ * without your data.
+ *
+ * If you change kernel command line between suspend and resume...
+ * ...prepare for nasty fsck or worse.
+ *
+ * If you change your hardware while system is suspended...
+ * ...well, it was not good idea.
+ *
+ * (*) suspend/resume support is needed to make it safe.
+
+You need to append resume=/dev/your_swap_partition to the kernel command
+line. Then you suspend by:
+
+echo shutdown > /sys/power/disk; echo disk > /sys/power/state
+
+If you feel ACPI works pretty well on your system, you might try:
+
+echo platform > /sys/power/disk; echo disk > /sys/power/state
+
+
+
+Article about goals and implementation of Software Suspend for Linux
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Author: Gábor Kuti
+Last revised: 2003-10-20 by Pavel Machek
+
+Idea and goals to achieve
+
+Nowadays many laptops have a suspend button. It saves the state of the
+machine to a filesystem or to a partition and switches to standby mode.
+When the machine is later resumed, the saved state is loaded back into
+RAM and the machine can continue its work. This has two real benefits.
+First, we save ourselves the time the machine takes to go down and boot up
+again; energy costs are really high when running from batteries. Second,
+we don't have to interrupt our programs, so processes that calculate
+something for a long time need not be written to be interruptible.
+
+swsusp saves the state of the machine into active swap and then reboots or
+powers down. You must explicitly specify the swap partition to resume from
+with the ``resume='' kernel option. If a signature is found, the kernel
+loads and restores the saved state. If the option ``noresume'' is specified
+as a boot parameter, resuming is skipped.
+
+While the system is suspended you should not add or remove any hardware,
+write to the filesystems, etc.
+
+Sleep states summary
+====================
+
+There are three different interfaces you can use, /proc/acpi should
+work like this:
+
+In a really perfect world:
+echo 1 > /proc/acpi/sleep # for standby
+echo 2 > /proc/acpi/sleep # for suspend to ram
+echo 3 > /proc/acpi/sleep # for suspend to ram, but more power conservative
+echo 4 > /proc/acpi/sleep # for suspend to disk
+echo 5 > /proc/acpi/sleep # for unfriendly shutdown of the system
+
+and perhaps
+echo 4b > /proc/acpi/sleep # for suspend to disk via s4bios
+
+Frequently Asked Questions
+==========================
+
+Q: well, suspending a server is IMHO a really stupid thing,
+but... (Diego Zuccato):
+
+A: You bought a new UPS for your server. How do you install it without
+bringing the machine down? Suspend to disk, rearrange power cables,
+resume.
+
+You have your server on a UPS. Power died, and the UPS is indicating 30
+seconds to failure. What do you do? Suspend to disk.
+
+The Ethernet card in your server died. You want to replace it. Your
+server is not hotplug capable. What do you do? Suspend to disk,
+replace the Ethernet card, resume. If you are fast your users will not
+even see broken connections.
+
+
+Q: Maybe I'm missing something, but why don't the regular I/O paths work?
+
+A: We do use the regular I/O paths. However we cannot restore the data
+to its original location as we load it. That would create an
+inconsistent kernel state which would certainly result in an oops.
+Instead, we load the image into unused memory and then atomically copy
+it back to its original location. This implies, of course, a maximum
+image size of half the amount of memory.
+
+There are two solutions to this:
+
+* require half of memory to be free during suspend. That way you can
+read "new" data onto free spots, then cli and copy
+
+* assume we had a special "polling" IDE driver that only uses memory
+between 0-640KB. That way, I'd have to make sure that 0-640KB is free
+during suspending, but otherwise it would work...
+
+suspend2 shares this fundamental limitation, but does not count user
+data and disk caches as "used memory", because it saves them in
+advance. That means that the limitation goes away in practice.
+
+Q: Does Linux support ACPI S4?
+
+A: Yes. That's what echo platform > /sys/power/disk does.
+
+Q: My machine doesn't work with ACPI. How can I use swsusp then?
+
+A: Do a reboot() syscall with the right parameters. Warning: glibc gets in
+the way, so check with strace:
+
+reboot(LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, 0xd000fce2)
+
+(Thanks to Peter Osterlund:)
+
+#include <unistd.h>
+#include <syscall.h>
+
+#define LINUX_REBOOT_MAGIC1 0xfee1dead
+#define LINUX_REBOOT_MAGIC2 672274793
+#define LINUX_REBOOT_CMD_SW_SUSPEND 0xD000FCE2
+
+int main()
+{
+ syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
+ LINUX_REBOOT_CMD_SW_SUSPEND, 0);
+ return 0;
+}
+
+Also, the /sys/ interface should still be present.
+
+Q: What is 'suspend2'?
+
+A: suspend2 is 'Software Suspend 2', a forked implementation of
+suspend-to-disk which is available as separate patches for 2.4 and 2.6
+kernels from swsusp.sourceforge.net. It includes support for SMP, 4GB
+highmem and preemption. It also has an extensible architecture that
+allows for arbitrary transformations on the image (compression,
+encryption) and arbitrary backends for writing the image (e.g. to swap
+or an NFS share [Work In Progress]). Questions regarding suspend2
+should be sent to the mailing list available through the suspend2
+website, and not to the Linux Kernel Mailing List. We are working
+toward merging suspend2 into the mainline kernel.
+
+Q: A kernel thread must voluntarily freeze itself (call 'refrigerator').
+I found some kernel threads that don't do it, and they don't freeze
+so the system can't sleep. Is this a known behavior?
+
+A: All such kernel threads need to be fixed, one by one. Select the
+place where the thread is safe to be frozen (no kernel semaphores
+should be held at that point and it must be safe to sleep there), and
+add:
+
+ if (current->flags & PF_FREEZE)
+ refrigerator(PF_FREEZE);
+
+If the thread is needed for writing the image to storage, you should
+instead set the PF_NOFREEZE process flag when creating the thread.
+
+
+Q: What is the difference between "platform", "shutdown" and
+"firmware" in /sys/power/disk?
+
+A:
+
+shutdown: save state in linux, then tell bios to powerdown
+
+platform: save state in linux, then tell bios to powerdown and blink
+ "suspended led"
+
+firmware: tell bios to save state itself [needs BIOS-specific suspend
+ partition, and has very little to do with swsusp]
+
+"platform" is actually the right thing to do, but "shutdown" is the most
+reliable.
+
+Q: I do not understand why you have such strong objections to the idea of
+selective suspend.
+
+A: Do selective suspend during runtime power management, that's okay. But
+it's useless for suspend-to-disk. (And I do not see how you could use
+it for suspend-to-ram, I hope you do not want that).
+
+Let's see, so you suggest to
+
+* SUSPEND all but swap device and parents
+* Snapshot
+* Write image to disk
+* SUSPEND swap device and parents
+* Powerdown
+
+Oh no, that does not work, if the swap device or its parents use DMA,
+you've corrupted data. You'd have to do
+
+* SUSPEND all but swap device and parents
+* FREEZE swap device and parents
+* Snapshot
+* UNFREEZE swap device and parents
+* Write
+* SUSPEND swap device and parents
+
+Which means that you still need that FREEZE state, and you get more
+complicated code. (And I have not yet introduced details like system
+devices).
+
+Q: There don't seem to be any generally useful behavioral
+distinctions between SUSPEND and FREEZE.
+
+A: Doing SUSPEND when you are asked to do FREEZE is always correct,
+but it may be unnecessarily slow. If you want USB to stay simple,
+slowness may not matter to you. It can always be fixed later.
+
+For devices like disks it does matter; you do not want to spin down for
+FREEZE.
+
+Q: After resuming, the system is paging heavily, leading to very bad interactivity.
+
+A: Try running
+
+cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null
+
+after resume. swapoff -a; swapon -a may also be useful.
diff --git a/Documentation/power/tricks.txt b/Documentation/power/tricks.txt
new file mode 100644
index 0000000..c6d58d3
--- /dev/null
+++ b/Documentation/power/tricks.txt
@@ -0,0 +1,27 @@
+ swsusp/S3 tricks
+ ~~~~~~~~~~~~~~~~
+Pavel Machek <pavel@suse.cz>
+
+If you want to trick swsusp/S3 into working, you might want to try:
+
+* go with minimal config, turn off drivers like USB, AGP you don't
+ really need
+
+* turn off APIC and preempt
+
+* use ext2. At least it has working fsck. [If something seems to go
+ wrong, force fsck when you have a chance]
+
+* turn off modules
+
+* use vga text console, shut down X. [If you really want X, you might
+ want to try vesafb later]
+
+* try running as few processes as possible, preferably go to single
+ user mode.
+
+* due to video issues, swsusp should be easier to get working than
+ S3. Try that first.
+
+When you make it work, try to find out what exactly it was that broke
+suspend, and preferably fix that.
diff --git a/Documentation/power/video.txt b/Documentation/power/video.txt
new file mode 100644
index 0000000..8686968
--- /dev/null
+++ b/Documentation/power/video.txt
@@ -0,0 +1,169 @@
+
+ Video issues with S3 resume
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ 2003-2005, Pavel Machek
+
+During S3 resume, hardware needs to be reinitialized. For most
+devices, this is easy, and the kernel driver knows how to do
+it. Unfortunately there's one exception: the video card. Those are usually
+initialized by the BIOS, and the kernel does not have enough information to
+boot the video card. (The kernel usually does not even contain a video card
+driver -- vesafb and vgacon are widely used).
+
+This is not a problem for swsusp, because during swsusp resume, the BIOS is
+run normally so the video card is normally initialized. S3 has absolutely
+no chance of working with SMP/HT. Be sure to turn it off before
+testing (swsusp should work ok, OTOH).
+
+There are a few types of systems where video works after S3 resume:
+
+(1) systems where video state is preserved over S3.
+
+(2) systems where it is possible to call the video BIOS during S3
+ resume. Unfortunately, it is not correct to call the video BIOS at
+ that point, but it happens to work on some machines. Use
+ acpi_sleep=s3_bios.
+
+(3) systems that initialize video card into vga text mode and where
+ the BIOS works well enough to be able to set video mode. Use
+ acpi_sleep=s3_mode on these.
+
+(4) on some systems s3_bios kicks video into text mode, and
+ acpi_sleep=s3_bios,s3_mode is needed.
+
+(5) radeon systems, where X can soft-boot your video card. You'll need
+ new enough X, and plain text console (no vesafb or radeonfb), see
+ http://www.doesi.gmxhome.de/linux/tm800s3/s3.html. Actually you
+ should probably use vbetool (6) instead.
+
+(6) other radeon systems, where vbetool is enough to bring system back
+ to life. It needs text console to be working. Do vbetool vbestate
+ save > /tmp/delme; echo 3 > /proc/acpi/sleep; vbetool post; vbetool
+ vbestate restore < /tmp/delme; setfont <whatever>, and your video
+ should work.
+
+(7) on some systems, it is possible to boot most of the kernel, and then
+ POSTing the BIOS works. Ole Rohne has a patch to do just that at
+ http://dev.gentoo.org/~marineam/patch-radeonfb-2.6.11-rc2-mm2.
+
+Now, if you pass acpi_sleep=something, and it does not work with your
+bios, you'll get a hard crash during resume. Be careful. Also it is
+safest to do your experiments with plain old VGA console. The vesafb
+and radeonfb (etc) drivers have a tendency to crash the machine during
+resume.
+
+You may have a system where none of the above works. At that point you
+either invent another ugly hack that works, or write a proper driver for
+your video card (good luck getting docs :-(). Maybe suspending from X
+(proper X, knowing your hardware, not XF68_FBcon) might have a better
+chance of working.
+
+Table of known working systems:
+
+Model                           hack (or "how to do it")
+------------------------------------------------------------------------------
+Acer Aspire 1406LC              ole's late BIOS init (7), turn off DRI
+Acer TM 242FX                   vbetool (6)
+Acer TM C300                    vga=normal (only suspend on console, not in X), vbetool (6)
+Acer TM 4052LCi                 s3_bios (2)
+Acer TM 636Lci                  s3_bios vga=normal (2)
+Acer TM 650 (Radeon M7)         vga=normal plus boot-radeon (5) gets text console back
+Acer TM 660                     ??? (*)
+Acer TM 800                     vga=normal, X patches, see webpage (5) or vbetool (6)
+Acer TM 803                     vga=normal, X patches, see webpage (5) or vbetool (6)
+Acer TM 803LCi                  vga=normal, vbetool (6)
+Arima W730a                     vbetool needed (6)
+Asus L2400D                     s3_mode (3)(***) (S1 also works OK)
+Asus L3800C (Radeon M7)         s3_bios (2) (S1 also works OK)
+Asus M6NE                       ??? (*)
+Athlon64 desktop prototype      s3_bios (2)
+Compal CL-50                    ??? (*)
+Compaq Armada E500 - P3-700     none (1) (S1 also works OK)
+Compaq Evo N620c                vga=normal, s3_bios (2)
+Dell 600m, ATI R250 Lf          none (1), but needs xorg-x11-6.8.1.902-1
+Dell D600, ATI RV250            vga=normal and X, or try vbestate (6)
+Dell Inspiron 4000              ??? (*)
+Dell Inspiron 500m              ??? (*)
+Dell Inspiron 600m              ??? (*)
+Dell Inspiron 8200              ??? (*)
+Dell Inspiron 8500              ??? (*)
+Dell Inspiron 8600              ??? (*)
+eMachines athlon64 machines     vbetool needed (6) (someone please get me model #s)
+HP NC6000                       s3_bios, may not use radeonfb (2); or vbetool (6)
+HP NX7000                       ??? (*)
+HP Pavilion ZD7000              vbetool post needed, need open-source nv driver for X
+HP Omnibook XE3 athlon version  none (1)
+HP Omnibook XE3GC               none (1), video is S3 Savage/IX-MV
+IBM TP T20, model 2647-44G      none (1), video is S3 Inc. 86C270-294 Savage/IX-MV, vesafb gets "interesting" but X works.
+IBM TP A31 / Type 2652-M5G      s3_mode (3) [works ok with BIOS 1.04 2002-08-23, but not at all with BIOS 1.11 2004-11-05 :-(]
+IBM TP R32 / Type 2658-MMG      none (1)
+IBM TP R40 2722B3G              ??? (*)
+IBM TP R50p / Type 1832-22U     s3_bios (2)
+IBM TP R51                      ??? (*)
+IBM TP T30 236681A              ??? (*)
+IBM TP T40 / Type 2373-MU4      none (1)
+IBM TP T40p                     none (1)
+IBM TP R40p                     s3_bios (2)
+IBM TP T41p                     s3_bios (2), switch to X after resume
+IBM TP T42                      ??? (*)
+IBM ThinkPad T42p (2373-GTG)    s3_bios (2)
+IBM TP X20                      ??? (*)
+IBM TP X30                      ??? (*)
+IBM TP X31 / Type 2672-XXH      none (1), use radeontool (http://fdd.com/software/radeon/) to turn off backlight.
+IBM Thinkpad X40 Type 2371-7JG  s3_bios,s3_mode (4)
+Medion MD4220                   ??? (*)
+Samsung P35                     vbetool needed (6)
+Sharp PC-AR10 (ATI rage)        none (1)
+Sony Vaio PCG-F403              ??? (*)
+Sony Vaio PCG-N505SN            ??? (*)
+Sony Vaio vgn-s260              X or boot-radeon can init it (5)
+Toshiba Libretto L5             none (1)
+Toshiba Satellite 4030CDT       s3_mode (3)
+Toshiba Satellite 4080XCDT      s3_mode (3)
+Toshiba Satellite 4090XCDT      ??? (*)
+Toshiba Satellite P10-554       s3_bios,s3_mode (4)(****)
+Uniwill 244IIO                  ??? (*)
+
+
+(*) from http://www.ubuntulinux.org/wiki/HoaryPMResults, not sure
+ which options to use. If you know, please tell me.
+
+(***) To be tested with a newer kernel.
+
+(****) Not with SMP kernel, UP only.
+
+VBEtool details
+~~~~~~~~~~~~~~~
+(with thanks to Carl-Daniel Hailfinger)
+
+First, boot into X and run the following script ONCE:
+#!/bin/bash
+statedir=/root/s3/state
+mkdir -p $statedir
+chvt 2
+sleep 1
+vbetool vbestate save >$statedir/vbe
+
+
+To suspend and resume properly, call the following script as root:
+#!/bin/bash
+statedir=/root/s3/state
+curcons=`fgconsole`
+fuser /dev/tty$curcons 2>/dev/null|xargs ps -o comm= -p|grep -q X && chvt 2
+cat /dev/vcsa >$statedir/vcsa
+sync
+echo 3 >/proc/acpi/sleep
+sync
+vbetool post
+vbetool vbestate restore <$statedir/vbe
+cat $statedir/vcsa >/dev/vcsa
+rckbd restart
+chvt $((curcons%6+1))
+chvt $curcons
+
+
+Unless you change your graphics card or other hardware configuration,
+the state once saved will be OK for every resume afterwards.
+NOTE: The "rckbd restart" command may be different for your
+distribution. Simply replace it with the command you would use to
+set the fonts on screen.
diff --git a/Documentation/power/video_extension.txt b/Documentation/power/video_extension.txt
new file mode 100644
index 0000000..8e33d7c8
--- /dev/null
+++ b/Documentation/power/video_extension.txt
@@ -0,0 +1,34 @@
+This driver implements the ACPI Extensions For Display Adapters
+for integrated graphics devices on the motherboard, as specified in the
+ACPI 2.0 Specification, Appendix B. It allows some basic control, such as
+defining the video POST device, retrieving EDID information, or setting
+up a video output. Note that this is a reference implementation only.
+It may or may not work for your integrated video device.
+
+Interfaces exposed to userland through /proc/acpi/video:
+
+VGA/info : Displays the supported video bus device capabilities, such as Video ROM, CRT/LCD/TV.
+VGA/ROM : Used to get a copy of the display devices' ROM data (up to 4k).
+VGA/POST_info : Used to determine what options are implemented.
+VGA/POST : Used to get/set POST device.
+VGA/DOS : Used to get/set ownership of output switching:
+ Please refer to ACPI spec B.4.1 _DOS
+VGA/CRT : CRT output
+VGA/LCD : LCD output
+VGA/TV : TV output
+VGA/*/brightness : Used to get/set brightness of output device
+
+Notify event through /proc/acpi/event:
+
+#define ACPI_VIDEO_NOTIFY_SWITCH 0x80
+#define ACPI_VIDEO_NOTIFY_PROBE 0x81
+#define ACPI_VIDEO_NOTIFY_CYCLE 0x82
+#define ACPI_VIDEO_NOTIFY_NEXT_OUTPUT 0x83
+#define ACPI_VIDEO_NOTIFY_PREV_OUTPUT 0x84
+
+#define ACPI_VIDEO_NOTIFY_CYCLE_BRIGHTNESS 0x82
+#define ACPI_VIDEO_NOTIFY_INC_BRIGHTNESS 0x83
+#define ACPI_VIDEO_NOTIFY_DEC_BRIGHTNESS 0x84
+#define ACPI_VIDEO_NOTIFY_ZERO_BRIGHTNESS 0x85
+#define ACPI_VIDEO_NOTIFY_DISPLAY_OFF 0x86
+