| =========================== |
| Linux for S/390 and zSeries |
| =========================== |
| |
| Common Device Support (CDS) |
| Device Driver I/O Support Routines |
| |
| Authors: |
| - Ingo Adlung |
| - Cornelia Huck |
| |
| Copyright, IBM Corp. 1999-2002 |
| |
| Introduction |
| ============ |
| |
| This document describes the common device support routines for Linux/390. |
| Different than other hardware architectures, ESA/390 has defined a unified |
| I/O access method. This gives relief to the device drivers as they don't |
| have to deal with different bus types, polling versus interrupt |
| processing, shared versus non-shared interrupt processing, DMA versus port |
| I/O (PIO), and other hardware features more. However, this implies that |
| either every single device driver needs to implement the hardware I/O |
| attachment functionality itself, or the operating system provides for a |
| unified method to access the hardware, providing all the functionality that |
| every single device driver would have to provide itself. |
| |
| The document does not intend to explain the ESA/390 hardware architecture in |
| every detail.This information can be obtained from the ESA/390 Principles of |
| Operation manual (IBM Form. No. SA22-7201). |
| |
| In order to build common device support for ESA/390 I/O interfaces, a |
| functional layer was introduced that provides generic I/O access methods to |
| the hardware. |
| |
| The common device support layer comprises the I/O support routines defined |
| below. Some of them implement common Linux device driver interfaces, while |
| some of them are ESA/390 platform specific. |
| |
| Note: |
| In order to write a driver for S/390, you also need to look into the interface |
| described in Documentation/s390/driver-model.rst. |
| |
| Note for porting drivers from 2.4: |
| |
| The major changes are: |
| |
| * The functions use a ccw_device instead of an irq (subchannel). |
| * All drivers must define a ccw_driver (see driver-model.txt) and the associated |
| functions. |
| * request_irq() and free_irq() are no longer done by the driver. |
| * The oper_handler is (kindof) replaced by the probe() and set_online() functions |
| of the ccw_driver. |
| * The not_oper_handler is (kindof) replaced by the remove() and set_offline() |
| functions of the ccw_driver. |
| * The channel device layer is gone. |
| * The interrupt handlers must be adapted to use a ccw_device as argument. |
| Moreover, they don't return a devstat, but an irb. |
| * Before initiating an io, the options must be set via ccw_device_set_options(). |
| * Instead of calling read_dev_chars()/read_conf_data(), the driver issues |
| the channel program and handles the interrupt itself. |
| |
| ccw_device_get_ciw() |
| get commands from extended sense data. |
| |
| ccw_device_start(), ccw_device_start_timeout(), ccw_device_start_key(), ccw_device_start_key_timeout() |
| initiate an I/O request. |
| |
| ccw_device_resume() |
| resume channel program execution. |
| |
| ccw_device_halt() |
| terminate the current I/O request processed on the device. |
| |
| do_IRQ() |
| generic interrupt routine. This function is called by the interrupt entry |
| routine whenever an I/O interrupt is presented to the system. The do_IRQ() |
| routine determines the interrupt status and calls the device specific |
| interrupt handler according to the rules (flags) defined during I/O request |
| initiation with do_IO(). |
| |
| The next chapters describe the functions other than do_IRQ() in more details. |
| The do_IRQ() interface is not described, as it is called from the Linux/390 |
| first level interrupt handler only and does not comprise a device driver |
| callable interface. Instead, the functional description of do_IO() also |
| describes the input to the device specific interrupt handler. |
| |
| Note: |
| All explanations apply also to the 64 bit architecture s390x. |
| |
| |
| Common Device Support (CDS) for Linux/390 Device Drivers |
| ======================================================== |
| |
| General Information |
| ------------------- |
| |
| The following chapters describe the I/O related interface routines the |
| Linux/390 common device support (CDS) provides to allow for device specific |
| driver implementations on the IBM ESA/390 hardware platform. Those interfaces |
| intend to provide the functionality required by every device driver |
| implementation to allow to drive a specific hardware device on the ESA/390 |
| platform. Some of the interface routines are specific to Linux/390 and some |
| of them can be found on other Linux platforms implementations too. |
| Miscellaneous function prototypes, data declarations, and macro definitions |
| can be found in the architecture specific C header file |
| linux/arch/s390/include/asm/irq.h. |
| |
| Overview of CDS interface concepts |
| ---------------------------------- |
| |
| Different to other hardware platforms, the ESA/390 architecture doesn't define |
| interrupt lines managed by a specific interrupt controller and bus systems |
| that may or may not allow for shared interrupts, DMA processing, etc.. Instead, |
| the ESA/390 architecture has implemented a so called channel subsystem, that |
| provides a unified view of the devices physically attached to the systems. |
| Though the ESA/390 hardware platform knows about a huge variety of different |
| peripheral attachments like disk devices (aka. DASDs), tapes, communication |
| controllers, etc. they can all be accessed by a well defined access method and |
| they are presenting I/O completion a unified way : I/O interruptions. Every |
| single device is uniquely identified to the system by a so called subchannel, |
| where the ESA/390 architecture allows for 64k devices be attached. |
| |
| Linux, however, was first built on the Intel PC architecture, with its two |
| cascaded 8259 programmable interrupt controllers (PICs), that allow for a |
| maximum of 15 different interrupt lines. All devices attached to such a system |
| share those 15 interrupt levels. Devices attached to the ISA bus system must |
| not share interrupt levels (aka. IRQs), as the ISA bus bases on edge triggered |
| interrupts. MCA, EISA, PCI and other bus systems base on level triggered |
| interrupts, and therewith allow for shared IRQs. However, if multiple devices |
| present their hardware status by the same (shared) IRQ, the operating system |
| has to call every single device driver registered on this IRQ in order to |
| determine the device driver owning the device that raised the interrupt. |
| |
| Up to kernel 2.4, Linux/390 used to provide interfaces via the IRQ (subchannel). |
| For internal use of the common I/O layer, these are still there. However, |
| device drivers should use the new calling interface via the ccw_device only. |
| |
| During its startup the Linux/390 system checks for peripheral devices. Each |
| of those devices is uniquely defined by a so called subchannel by the ESA/390 |
| channel subsystem. While the subchannel numbers are system generated, each |
| subchannel also takes a user defined attribute, the so called device number. |
| Both subchannel number and device number cannot exceed 65535. During sysfs |
| initialisation, the information about control unit type and device types that |
| imply specific I/O commands (channel command words - CCWs) in order to operate |
| the device are gathered. Device drivers can retrieve this set of hardware |
| information during their initialization step to recognize the devices they |
| support using the information saved in the struct ccw_device given to them. |
| This methods implies that Linux/390 doesn't require to probe for free (not |
| armed) interrupt request lines (IRQs) to drive its devices with. Where |
| applicable, the device drivers can use issue the READ DEVICE CHARACTERISTICS |
| ccw to retrieve device characteristics in its online routine. |
| |
| In order to allow for easy I/O initiation the CDS layer provides a |
| ccw_device_start() interface that takes a device specific channel program (one |
| or more CCWs) as input sets up the required architecture specific control blocks |
| and initiates an I/O request on behalf of the device driver. The |
| ccw_device_start() routine allows to specify whether it expects the CDS layer |
| to notify the device driver for every interrupt it observes, or with final status |
| only. See ccw_device_start() for more details. A device driver must never issue |
| ESA/390 I/O commands itself, but must use the Linux/390 CDS interfaces instead. |
| |
| For long running I/O request to be canceled, the CDS layer provides the |
| ccw_device_halt() function. Some devices require to initially issue a HALT |
| SUBCHANNEL (HSCH) command without having pending I/O requests. This function is |
| also covered by ccw_device_halt(). |
| |
| |
| get_ciw() - get command information word |
| |
| This call enables a device driver to get information about supported commands |
| from the extended SenseID data. |
| |
| :: |
| |
| struct ciw * |
| ccw_device_get_ciw(struct ccw_device *cdev, __u32 cmd); |
| |
| ==== ======================================================== |
| cdev The ccw_device for which the command is to be retrieved. |
| cmd The command type to be retrieved. |
| ==== ======================================================== |
| |
| ccw_device_get_ciw() returns: |
| |
| ===== ================================================================ |
| NULL No extended data available, invalid device or command not found. |
| !NULL The command requested. |
| ===== ================================================================ |
| |
| :: |
| |
| ccw_device_start() - Initiate I/O Request |
| |
| The ccw_device_start() routines is the I/O request front-end processor. All |
| device driver I/O requests must be issued using this routine. A device driver |
| must not issue ESA/390 I/O commands itself. Instead the ccw_device_start() |
| routine provides all interfaces required to drive arbitrary devices. |
| |
| This description also covers the status information passed to the device |
| driver's interrupt handler as this is related to the rules (flags) defined |
| with the associated I/O request when calling ccw_device_start(). |
| |
| :: |
| |
| int ccw_device_start(struct ccw_device *cdev, |
| struct ccw1 *cpa, |
| unsigned long intparm, |
| __u8 lpm, |
| unsigned long flags); |
| int ccw_device_start_timeout(struct ccw_device *cdev, |
| struct ccw1 *cpa, |
| unsigned long intparm, |
| __u8 lpm, |
| unsigned long flags, |
| int expires); |
| int ccw_device_start_key(struct ccw_device *cdev, |
| struct ccw1 *cpa, |
| unsigned long intparm, |
| __u8 lpm, |
| __u8 key, |
| unsigned long flags); |
| int ccw_device_start_key_timeout(struct ccw_device *cdev, |
| struct ccw1 *cpa, |
| unsigned long intparm, |
| __u8 lpm, |
| __u8 key, |
| unsigned long flags, |
| int expires); |
| |
| ============= ============================================================= |
| cdev ccw_device the I/O is destined for |
| cpa logical start address of channel program |
| user_intparm user specific interrupt information; will be presented |
| back to the device driver's interrupt handler. Allows a |
| device driver to associate the interrupt with a |
| particular I/O request. |
| lpm defines the channel path to be used for a specific I/O |
| request. A value of 0 will make cio use the opm. |
| key the storage key to use for the I/O (useful for operating on a |
| storage with a storage key != default key) |
| flag defines the action to be performed for I/O processing |
| expires timeout value in jiffies. The common I/O layer will terminate |
| the running program after this and call the interrupt handler |
| with ERR_PTR(-ETIMEDOUT) as irb. |
| ============= ============================================================= |
| |
| Possible flag values are: |
| |
| ========================= ============================================= |
| DOIO_ALLOW_SUSPEND channel program may become suspended |
| DOIO_DENY_PREFETCH don't allow for CCW prefetch; usually |
| this implies the channel program might |
| become modified |
| DOIO_SUPPRESS_INTER don't call the handler on intermediate status |
| ========================= ============================================= |
| |
| The cpa parameter points to the first format 1 CCW of a channel program:: |
| |
| struct ccw1 { |
| __u8 cmd_code;/* command code */ |
| __u8 flags; /* flags, like IDA addressing, etc. */ |
| __u16 count; /* byte count */ |
| __u32 cda; /* data address */ |
| } __attribute__ ((packed,aligned(8))); |
| |
| with the following CCW flags values defined: |
| |
| =================== ========================= |
| CCW_FLAG_DC data chaining |
| CCW_FLAG_CC command chaining |
| CCW_FLAG_SLI suppress incorrect length |
| CCW_FLAG_SKIP skip |
| CCW_FLAG_PCI PCI |
| CCW_FLAG_IDA indirect addressing |
| CCW_FLAG_SUSPEND suspend |
| =================== ========================= |
| |
| |
| Via ccw_device_set_options(), the device driver may specify the following |
| options for the device: |
| |
| ========================= ====================================== |
| DOIO_EARLY_NOTIFICATION allow for early interrupt notification |
| DOIO_REPORT_ALL report all interrupt conditions |
| ========================= ====================================== |
| |
| |
| The ccw_device_start() function returns: |
| |
| ======== ====================================================================== |
| 0 successful completion or request successfully initiated |
| -EBUSY The device is currently processing a previous I/O request, or there is |
| a status pending at the device. |
| -ENODEV cdev is invalid, the device is not operational or the ccw_device is |
| not online. |
| ======== ====================================================================== |
| |
| When the I/O request completes, the CDS first level interrupt handler will |
| accumulate the status in a struct irb and then call the device interrupt handler. |
| The intparm field will contain the value the device driver has associated with a |
| particular I/O request. If a pending device status was recognized, |
| intparm will be set to 0 (zero). This may happen during I/O initiation or delayed |
| by an alert status notification. In any case this status is not related to the |
| current (last) I/O request. In case of a delayed status notification no special |
| interrupt will be presented to indicate I/O completion as the I/O request was |
| never started, even though ccw_device_start() returned with successful completion. |
| |
| The irb may contain an error value, and the device driver should check for this |
| first: |
| |
| ========== ================================================================= |
| -ETIMEDOUT the common I/O layer terminated the request after the specified |
| timeout value |
| -EIO the common I/O layer terminated the request due to an error state |
| ========== ================================================================= |
| |
| If the concurrent sense flag in the extended status word (esw) in the irb is |
| set, the field erw.scnt in the esw describes the number of device specific |
| sense bytes available in the extended control word irb->scsw.ecw[]. No device |
| sensing by the device driver itself is required. |
| |
| The device interrupt handler can use the following definitions to investigate |
| the primary unit check source coded in sense byte 0 : |
| |
| ======================= ==== |
| SNS0_CMD_REJECT 0x80 |
| SNS0_INTERVENTION_REQ 0x40 |
| SNS0_BUS_OUT_CHECK 0x20 |
| SNS0_EQUIPMENT_CHECK 0x10 |
| SNS0_DATA_CHECK 0x08 |
| SNS0_OVERRUN 0x04 |
| SNS0_INCOMPL_DOMAIN 0x01 |
| ======================= ==== |
| |
| Depending on the device status, multiple of those values may be set together. |
| Please refer to the device specific documentation for details. |
| |
| The irb->scsw.cstat field provides the (accumulated) subchannel status : |
| |
| ========================= ============================ |
| SCHN_STAT_PCI program controlled interrupt |
| SCHN_STAT_INCORR_LEN incorrect length |
| SCHN_STAT_PROG_CHECK program check |
| SCHN_STAT_PROT_CHECK protection check |
| SCHN_STAT_CHN_DATA_CHK channel data check |
| SCHN_STAT_CHN_CTRL_CHK channel control check |
| SCHN_STAT_INTF_CTRL_CHK interface control check |
| SCHN_STAT_CHAIN_CHECK chaining check |
| ========================= ============================ |
| |
| The irb->scsw.dstat field provides the (accumulated) device status : |
| |
| ===================== ================= |
| DEV_STAT_ATTENTION attention |
| DEV_STAT_STAT_MOD status modifier |
| DEV_STAT_CU_END control unit end |
| DEV_STAT_BUSY busy |
| DEV_STAT_CHN_END channel end |
| DEV_STAT_DEV_END device end |
| DEV_STAT_UNIT_CHECK unit check |
| DEV_STAT_UNIT_EXCEP unit exception |
| ===================== ================= |
| |
| Please see the ESA/390 Principles of Operation manual for details on the |
| individual flag meanings. |
| |
| Usage Notes: |
| |
| ccw_device_start() must be called disabled and with the ccw device lock held. |
| |
| The device driver is allowed to issue the next ccw_device_start() call from |
| within its interrupt handler already. It is not required to schedule a |
| bottom-half, unless a non deterministically long running error recovery procedure |
| or similar needs to be scheduled. During I/O processing the Linux/390 generic |
| I/O device driver support has already obtained the IRQ lock, i.e. the handler |
| must not try to obtain it again when calling ccw_device_start() or we end in a |
| deadlock situation! |
| |
| If a device driver relies on an I/O request to be completed prior to start the |
| next it can reduce I/O processing overhead by chaining a NoOp I/O command |
| CCW_CMD_NOOP to the end of the submitted CCW chain. This will force Channel-End |
| and Device-End status to be presented together, with a single interrupt. |
| However, this should be used with care as it implies the channel will remain |
| busy, not being able to process I/O requests for other devices on the same |
| channel. Therefore e.g. read commands should never use this technique, as the |
| result will be presented by a single interrupt anyway. |
| |
| In order to minimize I/O overhead, a device driver should use the |
| DOIO_REPORT_ALL only if the device can report intermediate interrupt |
| information prior to device-end the device driver urgently relies on. In this |
| case all I/O interruptions are presented to the device driver until final |
| status is recognized. |
| |
| If a device is able to recover from asynchronously presented I/O errors, it can |
| perform overlapping I/O using the DOIO_EARLY_NOTIFICATION flag. While some |
| devices always report channel-end and device-end together, with a single |
| interrupt, others present primary status (channel-end) when the channel is |
| ready for the next I/O request and secondary status (device-end) when the data |
| transmission has been completed at the device. |
| |
| Above flag allows to exploit this feature, e.g. for communication devices that |
| can handle lost data on the network to allow for enhanced I/O processing. |
| |
| Unless the channel subsystem at any time presents a secondary status interrupt, |
| exploiting this feature will cause only primary status interrupts to be |
| presented to the device driver while overlapping I/O is performed. When a |
| secondary status without error (alert status) is presented, this indicates |
| successful completion for all overlapping ccw_device_start() requests that have |
| been issued since the last secondary (final) status. |
| |
| Channel programs that intend to set the suspend flag on a channel command word |
| (CCW) must start the I/O operation with the DOIO_ALLOW_SUSPEND option or the |
| suspend flag will cause a channel program check. At the time the channel program |
| becomes suspended an intermediate interrupt will be generated by the channel |
| subsystem. |
| |
| ccw_device_resume() - Resume Channel Program Execution |
| |
| If a device driver chooses to suspend the current channel program execution by |
| setting the CCW suspend flag on a particular CCW, the channel program execution |
| is suspended. In order to resume channel program execution the CIO layer |
| provides the ccw_device_resume() routine. |
| |
| :: |
| |
| int ccw_device_resume(struct ccw_device *cdev); |
| |
| ==== ================================================ |
| cdev ccw_device the resume operation is requested for |
| ==== ================================================ |
| |
| The ccw_device_resume() function returns: |
| |
| ========= ============================================== |
| 0 suspended channel program is resumed |
| -EBUSY status pending |
| -ENODEV cdev invalid or not-operational subchannel |
| -EINVAL resume function not applicable |
| -ENOTCONN there is no I/O request pending for completion |
| ========= ============================================== |
| |
| Usage Notes: |
| |
| Please have a look at the ccw_device_start() usage notes for more details on |
| suspended channel programs. |
| |
| ccw_device_halt() - Halt I/O Request Processing |
| |
| Sometimes a device driver might need a possibility to stop the processing of |
| a long-running channel program or the device might require to initially issue |
| a halt subchannel (HSCH) I/O command. For those purposes the ccw_device_halt() |
| command is provided. |
| |
| ccw_device_halt() must be called disabled and with the ccw device lock held. |
| |
| :: |
| |
| int ccw_device_halt(struct ccw_device *cdev, |
| unsigned long intparm); |
| |
| ======= ===================================================== |
| cdev ccw_device the halt operation is requested for |
| intparm interruption parameter; value is only used if no I/O |
| is outstanding, otherwise the intparm associated with |
| the I/O request is returned |
| ======= ===================================================== |
| |
| The ccw_device_halt() function returns: |
| |
| ======= ============================================================== |
| 0 request successfully initiated |
| -EBUSY the device is currently busy, or status pending. |
| -ENODEV cdev invalid. |
| -EINVAL The device is not operational or the ccw device is not online. |
| ======= ============================================================== |
| |
| Usage Notes: |
| |
| A device driver may write a never-ending channel program by writing a channel |
| program that at its end loops back to its beginning by means of a transfer in |
| channel (TIC) command (CCW_CMD_TIC). Usually this is performed by network |
| device drivers by setting the PCI CCW flag (CCW_FLAG_PCI). Once this CCW is |
| executed a program controlled interrupt (PCI) is generated. The device driver |
| can then perform an appropriate action. Prior to interrupt of an outstanding |
| read to a network device (with or without PCI flag) a ccw_device_halt() |
| is required to end the pending operation. |
| |
| :: |
| |
| ccw_device_clear() - Terminage I/O Request Processing |
| |
| In order to terminate all I/O processing at the subchannel, the clear subchannel |
| (CSCH) command is used. It can be issued via ccw_device_clear(). |
| |
| ccw_device_clear() must be called disabled and with the ccw device lock held. |
| |
| :: |
| |
| int ccw_device_clear(struct ccw_device *cdev, unsigned long intparm); |
| |
| ======= =============================================== |
| cdev ccw_device the clear operation is requested for |
| intparm interruption parameter (see ccw_device_halt()) |
| ======= =============================================== |
| |
| The ccw_device_clear() function returns: |
| |
| ======= ============================================================== |
| 0 request successfully initiated |
| -ENODEV cdev invalid |
| -EINVAL The device is not operational or the ccw device is not online. |
| ======= ============================================================== |
| |
| Miscellaneous Support Routines |
| ------------------------------ |
| |
| This chapter describes various routines to be used in a Linux/390 device |
| driver programming environment. |
| |
| get_ccwdev_lock() |
| |
| Get the address of the device specific lock. This is then used in |
| spin_lock() / spin_unlock() calls. |
| |
| :: |
| |
| __u8 ccw_device_get_path_mask(struct ccw_device *cdev); |
| |
| Get the mask of the path currently available for cdev. |